Abstract

We consider the problem of assigning a meaningful degree of belief to uncertainty estimates of
perturbative series. We analyse the assumptions which are implicit in the conventional estimates
made using renormalisation scale variations. We then formulate a Bayesian model that, given
equivalent initial hypotheses, allows one to characterise a perturbative theoretical uncertainty in
a rigorous way in terms of a credibility interval for the remainder of the series. We compare its
outcome to the conventional uncertainty estimates in the simple case of the calculation of QCD
corrections to the $e^+e^- \to$ hadrons process. We find comparable results, but with important
conceptual differences. This work represents a first step in the direction of a more comprehensive
and rigorous handling of theoretical uncertainties in perturbative calculations used in high energy
phenomenology.




Contents

1 Introduction
2 Theoretical uncertainty estimates
2.1 Conventional theoretical uncertainty estimate
2.2 Credibility-based theoretical uncertainty estimate
2.2.1 The model
2.2.2 Conditional densities $f(\bar c|c_0,\dots,c_k)$ and $f(c_n|c_0,\dots,c_k)$, $n > k$
2.2.3 Conditional density $f(\Delta_k|c_0,\dots,c_k)$
3 Comparison with the conventional method
4 Series starting at non-zero order $\alpha_s^l$
5 A realistic application: $e^+e^- \to$ hadrons
6 Discussion about the hypotheses of the model
6.1 Choice of the density function $f(c_n|\bar c)$
6.2 Choice of the density function $f(\ln \bar c)$
6.3 Choice of the expansion parameter
7 Partially known higher orders
8 Conclusions and outlook
A Partial cross section, coefficients and renormalisation scale
B Derivations of density distributions and uncertainty intervals
B.1 Derivation of $f(c_n|c_0,\dots,c_k)$ in eq. (24)
B.2 Derivation of $f(\bar c|c_0,\dots,c_k)$ in eq. (23)
B.3 Derivation of the smallest p%-credible interval in eq. (36)
B.4 Derivation of the approximate $f(\Delta_k|c_0,\dots,c_k)$ in eq. (37)
B.5 Derivation of the degree of belief of the scale variation bands in eq. (43)
B.6 Derivation of $f(c_n|c_0,\dots,c_k,c_{k+1})$ in eq. (77)

1 Introduction 

The Large Hadron Collider (LHC) has finally been fired up and, no black hole having swallowed
the Earth, the race to collect data and analyse them has now started in earnest. While the short
term goal is to rediscover the Standard Model, the long term one will of course be to find signals of
"new physics", be it the Higgs boson, supersymmetry, or something else, more exotic and possibly
unexpected. While it is everybody's hope that discoveries will announce themselves in the form
of unambiguous signals, it is of course conceivable, and probably also unavoidable initially, that
they may rather present themselves cloaked under some subtle data/theory discrepancy. If this
is the case, full control of the uncertainty of the theoretical predictions naturally becomes of
paramount importance: when comparing an experimental measurement to a theoretical calculation,
we must be able to say whether they agree or not, and with what degree of confidence we are making
such a statement. This is impossible to achieve unless both the experiment and the theory are provided
with a meaningful (and commonly accepted) degree of uncertainty.

While most of what we discuss below can apply to any kind of theoretical prediction in perturbation
theory, we will specialize it to the context of Quantum Chromodynamics (QCD): many
LHC processes and backgrounds pertain to the QCD realm and, due to the relatively large size of
the QCD coupling $\alpha_s$ and therefore the slower perturbative convergence, the issue of theoretical
accuracy is more pressing. Theoretical predictions in QCD contain multiple ingredients, inputs that
must be ultimately extracted from experimental data, like Parton Distribution Functions (PDFs)
for hadronic collisions and the value of $\alpha_s$. In the past several years a lot of progress has been made
on $\alpha_s$ and the PDFs. The uncertainty with which we know the coupling is now quite small (see
for instance [1]). Moreover, several groups [2, 3, 4, 5, 6] have extracted PDF sets with associated
uncertainties of experimental origin, and provided frameworks to properly propagate them to the
observable one is calculating. Huge progress has also been made in performing higher order
perturbative calculations for a large number of phenomenologically interesting observables [7], thereby
potentially improving the accuracy with which they are known.

One area where progress has instead arguably not been made is in an understanding of the
meaning of the residual theoretical uncertainty, given by unknown higher orders in perturbation
theory. This uncertainty is usually estimated by varying unphysical momentum scales (we will
denote them collectively by $\mu$) contained in the perturbative result, like the renormalisation and the
factorisation scales, around a central value $\mu_0$, usually taken to coincide with a physical momentum
scale $Q$ of the process. The method, the range in which to vary the scales (typically $[\mu_0/2, 2\mu_0]$),
and their central value $\mu_0 = Q$ are highly conventional, but nevertheless quite commonly accepted.
They allow the community to efficiently exchange a conventional uncertainty which can be easily
compared between different calculations.

Among the shortcomings one may find in this procedure, the most glaring one is probably that
it does not allow one to estimate the degree of belief (DoB) of the resulting uncertainty band. By
this we mean that it is not possible to associate a value, 68.3%, 95.5% or 99.7% for example, to
our belief that the uncertainty band contains the true sum¹ of the series. This lack of a proper
characterisation of the perturbative theoretical uncertainty also means that procedures to combine
it with other sources of uncertainties (e.g. the value of the coupling and the PDFs) are at best
ambiguous or controversial, as exemplified by a recent discussion [8] about the proper way to
estimate the total uncertainty of the prediction for the Higgs production cross section at hadron
colliders. All this makes it potentially impossible to fully and rigorously assess our degree of belief
that an experimental result may agree (or not) with theory, making betting (or, in order to cover
both sides, offering odds) on new physics having been discovered or not an altogether unscientific
- and potentially risky - proposition.

The purpose of this paper is precisely to try to make such a bet potentially safe, consistently
with the coherent bet idea of de Finetti [9]. To achieve this we construct a model that leads to
a well defined credibility measure for a perturbative theoretical uncertainty, so that the degree of
belief of a given interval can be explicitly calculated. In section 2 we first review the commonly
used theoretical uncertainty estimation via unphysical scale variations, and subsequently proceed
to define the Bayesian model from which we then extract our credibility distributions. Section 3
compares the results of our credibility-based model with those of the conventional method, allowing
one to assign a degree of belief to the uncertainty bands given by the latter. Some results for
the $e^+e^- \to$ hadrons process are given in section 5 to better illustrate the model with a realistic
example. Section 6 discusses some of the hypotheses that were made in building the model, and
section 7 extends it to the case of partial knowledge of higher order coefficients.

1 We forego here the fact that QCD series are usually not convergent but simply asymptotic. The onset of
the asymptotic behaviour usually taking place at fairly high perturbative orders, it normally does not affect realistic
phenomenological applications. In practice, the place of "true sum of the series" can be taken by the asymptotic value
of the series calculated with an appropriate prescription, or even by some more refined higher order result, though
we will keep (mis)using the term "true result" to mean the desired result beyond what has been really calculated.

Before closing this introduction we wish to stress the following point: we are not trying to
improve our knowledge of a perturbative prediction by adding physical information or even just
speculations about its form, or by (improbably) seeking physical content inside the mathematical
formalism: the only information that enters the result is what has been explicitly calculated, i.e.
the known coefficients of a perturbative series. To this information we add hypotheses meant to
formalize assumptions that are often implicitly made when estimating theoretical uncertainty using
scale variations, and we use the framework of Bayesian probability (see section 2.2) for computing
from them and from the available information the degree of belief of given uncertainty intervals.
The hypotheses need not even be strictly true (or people may disagree about them), but once they
are made the path to the calculation of the degree of belief values is a rigorous one.

2 Theoretical uncertainty estimates 

For definiteness, consider the perturbative calculation for the cross section of a process taking place
at a hard scale $Q$ (see footnote 1 for a comment about the asymptotic nature of QCD series):

$$\sigma(Q) = \sum_{n=0}^{\infty} c_n(Q,\mu_R)\,\alpha_s^n(\mu_R) , \tag{1}$$

where $\mu_R$ is the renormalisation scale (which we shall in the following simply denote by $\mu$), and the
coupling $\alpha_s(\mu)$ evolves according to

$$\mu^2 \frac{d\alpha_s}{d\mu^2} = \beta(\alpha_s) = -\alpha_s^2 \sum_{n=0}^{\infty} \beta_n \alpha_s^n . \tag{2}$$

A concrete example would of course be the production of hadrons in $e^+e^-$ collisions. When no
dependence is given explicitly, the coefficients and the coupling will be considered to be evaluated
at a renormalisation scale $\mu = Q$:

$$\sigma(Q) = \sum_{n=0}^{\infty} c_n(Q,Q)\,\alpha_s^n(Q) = \sum_{n=0}^{\infty} c_n \alpha_s^n . \tag{3}$$

Given $c_n = c_n(Q,Q)$ independent of $\mu$, one can always reinstate the full $\mu$ dependence and determine
$c_n(Q,\mu)$ using

$$c_n(Q,\mu) = \sum_{l=0}^{n} c_{n,l} \ln^l \frac{\mu^2}{Q^2} , \tag{4}$$

where $c_{n,0} = c_n$ and

$$c_{n,l} = \frac{1}{l} \sum_{j=0}^{n-1} j\,\beta_{n-1-j}\,c_{j,l-1} \tag{5}$$

(see appendix A for a derivation). Note that this last equation uses all the coefficients $c_j$ with
$j < n$.

We will also denote by

$$\sigma_k(Q,\mu) = \sum_{n=0}^{k} c_n(Q,\mu)\,\alpha_s^n(\mu) \tag{6}$$

(or, for short, $\sigma_k = \sum_{n=0}^{k} c_n \alpha_s^n$ for $\mu = Q$) the partial sum up to the last calculated perturbative
order $k$, and by

$$\Delta_k = \sum_{n=k+1}^{\infty} c_n \alpha_s^n \tag{7}$$

the remainder.
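
The recursion of eqs. (4) and (5) and the partial sum of eq. (6) can be sketched numerically as follows. This is only a minimal illustration under stated assumptions, not code from the paper: the function names are hypothetical, the running of $\alpha_s$ is the one-loop solution of eq. (2) (only $\beta_0$ is kept), and the numerical inputs used to exercise it are toy values.

```python
import math

def log_coefficients(c, beta):
    """Return table[n][l] = c_{n,l} from eq. (5).

    c[n]    : coefficients c_n = c_n(Q, Q) at mu = Q
    beta[n] : beta-function coefficients beta_n of eq. (2);
              len(beta) >= len(c) - 2 is required
    """
    table = []
    for n, c_n in enumerate(c):
        row = [c_n]                        # c_{n,0} = c_n
        for l in range(1, n + 1):
            total = 0.0
            for j in range(1, n):          # the j = 0 term carries a factor j and vanishes
                if l - 1 < len(table[j]):  # c_{j,l-1} exists only for l - 1 <= j
                    total += j * beta[n - 1 - j] * table[j][l - 1]
            row.append(total / l)
        table.append(row)
    return table

def alpha_s_one_loop(alpha_q, beta0, mu_over_q):
    """One-loop solution of eq. (2), keeping only beta_0 (an assumption of this sketch)."""
    return alpha_q / (1.0 + beta0 * alpha_q * math.log(mu_over_q ** 2))

def partial_sum(c, beta, alpha_q, mu_over_q):
    """sigma_k(Q, mu) of eq. (6), with k = len(c) - 1 and alpha_q = alpha_s(Q)."""
    log_mu = math.log(mu_over_q ** 2)      # ln(mu^2 / Q^2)
    a_mu = alpha_s_one_loop(alpha_q, beta[0], mu_over_q)
    table = log_coefficients(c, beta)
    return sum(
        sum(c_nl * log_mu ** l for l, c_nl in enumerate(row)) * a_mu ** n
        for n, row in enumerate(table)
    )
```

With toy inputs such as `c = [1.0, 1.0, 1.0]` and `beta = [0.6]`, the difference `partial_sum(c, beta, a, 2.0) - partial_sum(c, beta, a, 1.0)` is of order $\alpha_s^3$ for small `a`, which is the residual higher-order scale dependence exploited in section 2.1.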



2.1 Conventional theoretical uncertainty estimate 

The explicit $\mu$ dependence of $\sigma_k(Q,\mu)$ in eq. (6) serves as a reminder that, when truncated to
a finite order, a perturbative calculation retains a higher-order dependence on the scale $\mu$. This
dependence is generally exploited to estimate its "uncertainty",² i.e. the presumed value of $\Delta_k$. In
order to do so, one typically quotes an uncertainty interval $[\sigma_k^-, \sigma_k^+]$ around $\sigma_k$ (but not necessarily
centred on it). The specific choices for $\sigma_k^\pm$ can vary. Possible options are:

1. The values at the two ends of the scale range,

$$\sigma_k^- = \min\{\sigma_k(Q,Q/2), \sigma_k(Q,2Q)\} , \qquad \sigma_k^+ = \max\{\sigma_k(Q,Q/2), \sigma_k(Q,2Q)\} . \tag{8}$$

2. The extrema over the whole scale range,

$$\sigma_k^- = \min_{\mu \in [Q/2,2Q]} \{\sigma_k(Q,\mu)\} , \qquad \sigma_k^+ = \max_{\mu \in [Q/2,2Q]} \{\sigma_k(Q,\mu)\} . \tag{9}$$

3. An interval $[\sigma_k^-, \sigma_k^+]$ centred on $\sigma_k(Q,Q)$, eq. (10), whose width is set by $\delta_k$, where

$$\delta_k = |\sigma_k(Q,2Q) - \sigma_k(Q,Q/2)| . \tag{11}$$

4. Same as eq. (10), but with

$$\delta_k = \max_{\mu \in [Q/2,2Q]} \{\sigma_k(Q,\mu)\} - \min_{\mu \in [Q/2,2Q]} \{\sigma_k(Q,\mu)\} . \tag{12}$$

In the last two cases the interval is centred on $\sigma_k(Q,Q)$, whereas in the first two it is not necessarily
so. Note also that the choice of varying the scale $\mu$ within a factor of two around the physical scale
$Q$, i.e. in the range $[Q/2, 2Q]$, is fully conventional.
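
The options listed above can be compared on any given scale dependence with a few lines of code. The sketch below is illustrative only: `scale_variation_bands` is a hypothetical helper, the partial sum is passed in as a function of $\mu/Q$, and the extrema over $[Q/2, 2Q]$ are approximated by scanning a logarithmically spaced grid rather than by an exact minimisation.

```python
import math

def scale_variation_bands(sigma_k, n_grid=201):
    """Conventional scale-variation quantities for a partial sum.

    sigma_k : callable returning sigma_k(Q, mu) as a function of mu / Q
    n_grid  : number of log-spaced points used to scan mu / Q in [1/2, 2]
    """
    low, high = sigma_k(0.5), sigma_k(2.0)
    grid = [0.5 * 4.0 ** (i / (n_grid - 1)) for i in range(n_grid)]
    values = [sigma_k(x) for x in grid]
    return {
        "endpoints": (min(low, high), max(low, high)),  # eq. (8)
        "scan": (min(values), max(values)),             # eq. (9), grid approximation
        "delta_endpoints": abs(high - low),             # eq. (11)
        "delta_scan": max(values) - min(values),        # eq. (12), grid approximation
    }

# Toy scale dependence (an assumption, not a QCD result) with an extremum at mu = Q
toy_sigma = lambda x: 1.0 + 0.1 * math.log(x) ** 2
bands = scale_variation_bands(toy_sigma)
```

For this toy function the two endpoint values coincide, so the endpoint-based width of eq. (11) vanishes while the scan-based width of eq. (12) does not; this is the kind of accidental cancellation that makes the endpoint prescriptions fragile.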

A priori there is no reason why the interval $[\sigma_k^-, \sigma_k^+]$ should represent a sensible estimate of the
remainder $\Delta_k$ of the series since, from a purely mathematical point of view, $\delta_k$ (or $\sigma_k$) does not
contain any information about $\Delta_k$: $\sigma_k(Q,\mu)$ and $\delta_k$ are functions of the $c_n$ for $n \le k$, while $\Delta_k$ is
a function of the $c_n$ for $n > k$. However, the reason why this can instead often work in practice is
that, under certain circumstances, the size of $\delta_k$ can be similar to the size of $\Delta_k$. One can indeed
show that (see appendix A)

$$\delta_k \simeq \left| \frac{d\sigma_k}{d\ln\mu^2} \right| \left[ \ln(2Q)^2 - \ln(Q/2)^2 \right] \simeq 3 k \beta_0 |c_k| \alpha_s^{k+1} , \tag{13}$$

where the factor of 3 in the last term has been obtained by approximating the exact expression
$4\ln 2$ (this factor would be replaced by $4\ln r$ if the scale $\mu$ were varied in the range $[Q/r, rQ]$). The
last equality above is obtained by making the assumption that all the coefficients in the series share
the same magnitude and that $\alpha_s$ is reasonably small. Under these same hypotheses (and therefore
$|c_{k+1}| \simeq |c_k|$), we can also write

$$|\Delta_k| \simeq \alpha_s^{k+1} |c_{k+1}| \simeq \delta_k . \tag{14}$$

Experience with perturbative calculations in QCD has shown that theoretical uncertainty estimates
like those of eq. (10) are quite successful in predicting the range in which a higher order result will
fall. This can then be seen as an empirical validation of the assumption made above, i.e. that $|c_{k+1}|$
is indeed often of the same magnitude as $|c_k|$.

The limitation of this conventional approach is that, even if the hypothesis $|c_{k+1}| \simeq |c_k|$ is
correct and therefore $\delta_k$ correctly describes the size of the remainder of the series, there is no way
of deciding how reliably it may do so.

2 Of course it is not so much $\sigma_k$ which is "uncertain", in that it is perfectly well determined by the knowledge of
its coefficients and parameters, but rather the true value of the series and therefore to what extent $\sigma_k$ describes it.



2.2 Credibility-based theoretical uncertainty estimate 

In this paper we use the "Bayesian probability" (also called "subjective probability", "degree
of belief" or "credibility", see e.g. [10]), and distinguish it from the "frequentist probability".
The two concepts share the same mathematical formalism, but are nonetheless distinct. Bayesian
probability is not linked to an infinite number of realizations of an experiment. It deals with
a particular question, which may or may not be about the result of one particular realization
of a given experiment, and the consequences of the information one considers about its possible
answer.³ This information is not necessarily rigorous or "true" in any way, but its treatment,
once translated mathematically into the so called "priors" and "likelihoods", is. A distribution of
frequentist probability (or, for instance, its variance) gives a measure of the reproducibility of an
experiment. Conversely, a credibility distribution conveys information about the uncertainty of the
answer to a question, for instance the result of one particular realization of an experiment, prior to its
execution. The variables appearing in a frequentist probability distribution are commonly denoted
as random variables, since they take different values in different realizations of the experiment. We
instead call the variables in a credibility distribution uncertain variables, to better distinguish them
from the former: their values are not random (each of them being a single number), but simply
unknown.

Given a density function $f$, the degree of belief (or "credibility") that the value of an uncertain
variable $\eta$ belongs to the interval $[a, b]$ is then equal to

$$C(\eta \in [a,b]) = \int_a^b f(\eta)\, d\eta , \tag{15}$$

where the result is a number between zero and one.⁴

3 One may build one's initial credibility distribution using information of frequentist origin: e.g. after throwing an
unbiased six-sided die a large number of times (and hence establishing a frequentist probability), one can come
to believe (i.e. define a credibility measure) that there is a one-in-six chance that a given number will show up
in a subsequent throw (i.e. set the credibility measure equal to the frequentist probability previously established).
However, information of non-frequentist origin can also be included in a credibility-based approach: if someone is told
that the die is likely crooked, they can then adjust their expectations (degree of belief) using this information, even
before throwing the die a single time.

4 Note that, while this may seem similar to the "confidence level intervals" of frequentist statistical analyses, our
intervals are to be understood strictly within a Bayesian framework (where they can also be called "credible intervals"
at a given level), and should not be confused with the frequentist ones, which (see e.g. [11]) do not in fact express a
"level of confidence".

In this paper we will always work within the concept of degree of belief as defined above, and
will never use the frequentist probability. The latter is not applicable to the case of a theoretical
uncertainty, which is not amenable to a frequentist treatment (there is nothing one can "repeat")
and is much more akin to a systematic uncertainty instead.

2.2.1 The model

The goal of this paper is to establish a conditional density $f(\Delta_k|c_0,\dots,c_k)$ for the value of the remainder
of the series, defined in eq. (7), given the knowledge of the coefficients of the perturbative expansion
up to order $k$, and study its behavior. The reason for introducing a density function is that it
contains much more information than a simple uncertainty band like the one established in eq. (10)
in section 2.1.

To achieve this we will create a generic credibility measure, applicable to any possible perturbative
series, over the space of a priori unknown coefficients $c_0, c_1, \dots$ More precisely, we will create
a density function $f(c_0, c_1, \dots)$, normalised such that

$$\int f(c_0, c_1, \dots)\, dc_0\, dc_1 \cdots = 1 \tag{16}$$

and whose parameters can be marginalised according to

$$f(c_0,\dots,c_{i-1},c_{i+1},\dots) = \int f(c_0,\dots,c_{i-1},c_i,c_{i+1},\dots)\, dc_i , \tag{17}$$

$$f(c_0,\dots,c_k) = \int f(c_0, c_1, \dots)\, dc_{k+1}\, dc_{k+2} \cdots . \tag{18}$$

In the case of one particular physical process, some coefficients will have been already computed up
to order $k$: $c_0^{\rm true},\dots,c_k^{\rm true}$. The credibility measure for this particular process will be the inherited
measure defined on the subspace corresponding to $c_0 = c_0^{\rm true}$, $c_1 = c_1^{\rm true}$, ..., $c_k = c_k^{\rm true}$. For brevity,
we will use the notation $f(c_{k+1}|c_0,\dots,c_k)$ instead of $f(c_{k+1}|c_0 = c_0^{\rm true},\dots,c_k = c_k^{\rm true})$. The density
over the still unknown $c_{k+1}$ coefficient will then read, according to the standard conditional density
rule,

$$f(c_{k+1}|c_0,\dots,c_k) = \frac{f(c_0,\dots,c_k,c_{k+1})}{f(c_0,\dots,c_k)} . \tag{19}$$

To create this generic measure, we focus on the observation made at the end of section 2.1
(i.e. that the empirical success of conventional theoretical uncertainty estimates made using scale
variations can be explained by the fact that successive perturbative coefficients have similar
magnitudes) and reverse it. We make the assumption that all the coefficients $c_n$ in a perturbative series
share some sort of upper bound $\bar c > 0$ to their absolute values, specific to the physical process
studied.⁵ The calculated coefficients will give an estimate of this $\bar c$, restricting the possible values
for the unknown $c_n$. The set of uncertain variables that define the space on which we will create our
credibility measure is thus the set constituted of this parameter $\bar c$ and of all the a priori unknown
coefficients.

5 This hypothesis is of course known to be violated in practice, for instance by the factorial growth of the coefficients
in the presence of renormalons. However, knowing that such a factorial growth typically only kicks in at fairly large
perturbative orders, we can safely assume that our hypothesis holds true for low perturbative orders, which are the
ones which are calculated in practice. Another instance in which our hypothesis is violated is in particular kinematical
configurations (see e.g. [12]), or when new production channels open up at some higher order.

The model rests on three hypotheses:

1. Residual uncertainty

We suppose that, if we happened to know beforehand the parameter $\bar c$, our residual density
for the value of an unknown coefficient $c_n$ would be in the form of a uniform distribution,

$$f(c_n|\bar c) = \frac{1}{2\bar c} \begin{cases} 1 & \text{if } |c_n| \le \bar c \\ 0 & \text{if } |c_n| > \bar c \end{cases} \;=\; \frac{1}{2\bar c}\, \chi_{|c_n| \le \bar c} , \tag{20}$$

where $\chi_A$ is the characteristic function of a set $A$. We could (and probably should) use a
density function that does not vanish anywhere, like a Gaussian distribution, but the form
(20) leads to simpler expressions, so we use it in the following to study the model analytically.⁶

2. Shared information and independence

The parameter $\bar c$ models information that we consider to be shared by all coefficients, and
we make it the only one. When $\bar c$ is known, the residual uncertainties on the values of two
coefficients $c_n$ and $c_{n'}$ are then totally independent. In fact, we will suppose the coefficients
to be mutually independent, so that for a set of coefficients $\{c_i\}$ we have

$$f(\{c_i, i \in I\}|\bar c) = \prod_{i \in I} f(c_i|\bar c) . \tag{21}$$

The value of $\bar c$ is the maximal information that the coefficients share. It corresponds to
the maximal knowledge one could extract from the known coefficients $c_0,\dots,c_k$ in order to
"predict" the possible values of unknown ones $c_n$, $n > k$.

3. Hidden parameter

The value of the $\bar c$ parameter is "hidden" in the knowledge of the $c_n$. As long as we have
not calculated any coefficient, we can only say that it is a positive real number, and that all
values for its order of magnitude are a priori equally probable. In order to implement this in
practice we define a density for its logarithm as the limit of a uniform distribution between
$-|\ln\epsilon|$ and $|\ln\epsilon|$ when a small parameter $\epsilon$ tends to zero:

$$f_\epsilon(\ln\bar c) = \frac{1}{2|\ln\epsilon|}\, \chi_{|\ln\bar c| \le |\ln\epsilon|} \quad \Longleftrightarrow \quad f_\epsilon(\bar c) = \frac{1}{2|\ln\epsilon|}\, \frac{1}{\bar c}\, \chi_{\epsilon \le \bar c \le 1/\epsilon} . \tag{22}$$

We will perform calculations (both analytical and numerical) using this $\epsilon$-dependent density
$f_\epsilon$ with $\epsilon \neq 0$, and the final result will then be the limit $\epsilon \to 0$. The vanishing of a density in
this limit would mean that we do not have enough information to make any guess about the
result. For example, $f_\epsilon(\ln\bar c)$ tends to a "uniformly null" density, meaning that when no
coefficients are known we have no information whatsoever about the possible value of $\ln\bar c$.

The three hypotheses (20), (21) and (22) completely define the credibility measure over the
whole space of a priori uncertain variables $\{\bar c, c_0, c_1, \dots\}$. They then define every possible inherited
measure on a subspace associated with a physical process whose first coefficients are known.
Section 6 will revisit the choices made in building this model, for instance the choice of a uniform
distribution for $\ln\bar c$ rather than for $\bar c$ (eq. (22)), and study some alternatives and their consequences
on the results.

The following subsections are dedicated to deriving from these hypotheses the densities $f(\bar c|c_0,\dots,c_k)$
and $f(c_n|c_0,\dots,c_k)$ for $n > k$, and finally the residual theoretical uncertainty of a perturbative
prediction calculated up to order $k$, $f(\Delta_k|c_0,\dots,c_k)$.



6 We have checked that using a Gaussian distribution of mean zero and standard deviation $\bar c$ does not significantly
modify the general behaviour of the results shown in this paper.

2.2.2 Conditional densities $f(\bar c|c_0,\dots,c_k)$ and $f(c_n|c_0,\dots,c_k)$, $n > k$

Using the three hypotheses eqs. (20), (21) and (22) and the properties of conditional densities, one
can show that

$$f(\bar c|c_0,\dots,c_k) = (k+1)\, \frac{(\max(|c_0|,\dots,|c_k|))^{k+1}}{\bar c^{\,k+2}}\, \chi_{\bar c \ge \max(|c_0|,\dots,|c_k|)} \tag{23}$$

and

$$f(c_n|c_0,\dots,c_k) = \frac{1}{2}\, \frac{k+1}{k+2}\, \frac{(\max(|c_0|,\dots,|c_k|))^{k+1}}{(\max(|c_n|,|c_0|,\dots,|c_k|))^{k+2}} . \tag{24}$$

Let us derive for instance the second of these results. The conditional density for a generic
(uncalculated) coefficient $c_n$, $n > k$, is by definition (see eq. (19))

$$f_\epsilon(c_n|c_0,\dots,c_k) = \frac{f_\epsilon(c_0,\dots,c_k,c_n)}{f_\epsilon(c_0,\dots,c_k)} . \tag{25}$$

As stated in the previous subsection, we perform all calculations with $\epsilon \neq 0$ and we take the $\epsilon \to 0$
limit at the end. From eq. (17) and the property of conditional densities, similar to eq. (19), we
have

$$f_\epsilon(c_0,\dots,c_k) = \int f_\epsilon(c_0,\dots,c_k,\bar c)\, d\bar c = \int f_\epsilon(c_0,\dots,c_k|\bar c)\, f_\epsilon(\bar c)\, d\bar c . \tag{26}$$

Using the factorisation property (21) and the definitions (20) and (22) we get

$$\begin{aligned} f_\epsilon(c_0,\dots,c_k) &= \int \left( \prod_{i=0}^{k} f(c_i|\bar c) \right) f_\epsilon(\bar c)\, d\bar c \\ &= \int \left( \prod_{i=0}^{k} \frac{1}{2\bar c}\, \chi_{|c_i| \le \bar c} \right) \frac{1}{2|\ln\epsilon|}\, \frac{1}{\bar c}\, \chi_{\epsilon \le \bar c \le 1/\epsilon}\, d\bar c \\ &= \frac{1}{2^{k+2}}\, \frac{1}{|\ln\epsilon|} \int_{\max(|c_0|,\dots,|c_k|,\epsilon)}^{1/\epsilon} \frac{1}{\bar c^{\,k+2}}\, d\bar c . \end{aligned} \tag{27}$$

A similar result holds for $f_\epsilon(c_0,\dots,c_k,c_n)$:

$$f_\epsilon(c_0,\dots,c_k,c_n) = \frac{1}{2^{k+3}}\, \frac{1}{|\ln\epsilon|} \int_{\max(|c_n|,|c_0|,\dots,|c_k|,\epsilon)}^{1/\epsilon} \frac{1}{\bar c^{\,k+3}}\, d\bar c . \tag{28}$$

We therefore can write, using eq. (25),

$$f(c_n|c_0,\dots,c_k) = \lim_{\epsilon \to 0} \frac{f_\epsilon(c_0,\dots,c_k,c_n)}{f_\epsilon(c_0,\dots,c_k)} = \frac{1}{2}\, \frac{k+1}{k+2}\, \frac{(\max(|c_0|,\dots,|c_k|))^{k+1}}{(\max(|c_n|,|c_0|,\dots,|c_k|))^{k+2}} . \tag{29}$$

Note that in this equation the value $k+1$ represents the total number of known perturbative
coefficients $c_0,\dots,c_k$ used to estimate $c_n$ with $n > k$, rather than simply one unit above the last
calculated perturbative order $k$. Similarly, $k+2$ is this total number plus one. If a series starts at
a non-zero order $\alpha_s^l$, its last known perturbative order $k$ plus one will no longer give the number
of known coefficients. We will detail in section 4 the modifications to be made in this and in the
following equations to account for such a case.

Figure 1: Densities $f(\bar c|c_0,\dots,c_k)$ and $f(c_n|c_0,\dots,c_k)$ in the case $\bar c_{(k)} = 1$ for $k = 0, 1, 5, 10$ and
$10^3$ from the largest dashing to the solid curve.

The derivation of eq. (29) is given in some more detail in appendix B.1. The full derivation
of eq. (23) is given in appendix B.2. From now on we will collect the derivations of the densities
and uncertainty intervals in appendix B, since we do not wish to focus on the technicalities of the
derivation of the conditional densities but rather on their behavior.

Defining $\bar c_{(k)} = \max(|c_0|,\dots,|c_k|)$ we can rewrite the densities (23) and (24) as

$$f(\bar c|c_0,\dots,c_k) = (k+1)\, \frac{\bar c_{(k)}^{\,k+1}}{\bar c^{\,k+2}}\, \chi_{\bar c \ge \bar c_{(k)}} \tag{30}$$

and

$$f(c_n|c_0,\dots,c_k) = \frac{k+1}{k+2}\, \frac{1}{2\bar c_{(k)}} \begin{cases} 1 & \text{if } |c_n| \le \bar c_{(k)} \\[4pt] \dfrac{1}{(|c_n|/\bar c_{(k)})^{k+2}} & \text{if } |c_n| > \bar c_{(k)} \end{cases} \tag{31}$$

Figure 1 shows the two distributions $f(c_n|c_0,\dots,c_k)$ and $f(\bar c|c_0,\dots,c_k)$, plotted as functions of
$c_n$ and $\bar c$ respectively, for different values of $k$, assuming that $\bar c_{(k)}$ remains equal to one. $\bar c_{(k)}$
acts as an estimate of $\bar c$, and the density $f(\bar c|c_0,\dots,c_k)$ can be seen to tend to a Dirac distribution
concentrated at $\bar c = \bar c_{(k)}$ when $k$ goes to infinity. The more coefficients are known, the more precisely
$\bar c$ is estimated. In the same $k \to \infty$ limit the density over the unknown $c_n$ tends to $f(c_n|\bar c = \bar c_{(k)})$
as given in eq. (20), the distribution that corresponds by construction to the remaining uncertainty
when the whole of the hidden information simulated by $\bar c$ is known. For a finite value of $k$, the
density is always wider than this limit: the uncertainty about unknown coefficients $c_n$ is larger when
one knows the values of only a few coefficients $c_0,\dots,c_k$ than when one possesses the full information
about the value of $\bar c$.
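
The closed forms of eqs. (30) and (31) are simple enough to be evaluated directly. The following sketch is only an illustration under the uniform-density hypotheses (20) to (22); the function names are hypothetical and the list `known` stands for the calculated coefficients $c_0,\dots,c_k$.

```python
def density_cbar(cbar, known):
    """f(cbar | c_0..c_k) of eq. (30); `known` is the list [c_0, ..., c_k]."""
    k = len(known) - 1
    cmax = max(abs(c) for c in known)      # cbar_(k)
    if cbar < cmax:
        return 0.0                         # cbar cannot lie below a known |c_i|
    return (k + 1) * cmax ** (k + 1) / cbar ** (k + 2)

def density_cn(cn, known):
    """f(c_n | c_0..c_k) of eq. (31) for an unknown coefficient, n > k."""
    k = len(known) - 1
    cmax = max(abs(c) for c in known)
    plateau = (k + 1) / (k + 2) / (2.0 * cmax)
    if abs(cn) <= cmax:
        return plateau                     # flat region |c_n| <= cbar_(k)
    return plateau * (cmax / abs(cn)) ** (k + 2)   # power-law tails
```

The flat region carries a total degree of belief of $(k+1)/(k+2)$ and the two tails together carry $1/(k+2)$, which makes explicit how the density narrows towards the uniform form of eq. (20) as more coefficients become known.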






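2.2.3 Conditional density $f(\Delta_k|c_0,\dots,c_k)$

The remainder $\Delta_k$ of the perturbative series depends on the values of all the unknown coefficients
$c_{k+1}, c_{k+2}, \dots$ Its density can be written as

$$f(\Delta_k|c_0,\dots,c_k) = \int \left[ \delta\!\left( \Delta_k - \sum_{n=k+1}^{\infty} \alpha_s^n c_n \right) \right] f(c_{k+1},c_{k+2},\dots|c_0,\dots,c_k)\, dc_{k+1}\, dc_{k+2} \cdots . \tag{32}$$

This expression is too complicated to be handled analytically, even in the case of the simple choice
of density in eq. (20) for the coefficients. However, making the approximation

$$\Delta_k \simeq \alpha_s^{k+1} c_{k+1} , \tag{33}$$

and using eq. (31) for $f(c_n|c_0,\dots,c_k)$ with $n = k+1$, we obtain

$$f(\Delta_k|c_0,\dots,c_k) \simeq \frac{k+1}{k+2}\, \frac{1}{2\alpha_s^{k+1}\bar c_{(k)}} \begin{cases} 1 & \text{if } |\Delta_k| \le \alpha_s^{k+1}\bar c_{(k)} \\[4pt] \dfrac{1}{\left(|\Delta_k|/(\alpha_s^{k+1}\bar c_{(k)})\right)^{k+2}} & \text{if } |\Delta_k| > \alpha_s^{k+1}\bar c_{(k)} \end{cases} \tag{34}$$

This result depends on the entire set of the calculated coefficients via the parameter $\bar c_{(k)} =
\max(|c_0|,\dots,|c_k|)$.

The knowledge of $f(\Delta_k|c_0,\dots,c_k)$ allows one to calculate the smallest p%-credible interval for
$\Delta_k$. It turns out to be centred at zero, and hence we denote it by $[-d_k^{(p)}, d_k^{(p)}]$. It is defined implicitly
by

$$p\% = \int_{-d_k^{(p)}}^{d_k^{(p)}} f(\Delta_k|c_0,\dots,c_k)\, d\Delta_k \tag{35}$$

and one finds, using the analytical approximation in eq. (34) (see appendix B.3),

$$d_k^{(p)} = \begin{cases} \alpha_s^{k+1}\bar c_{(k)}\, \dfrac{k+2}{k+1}\, p\% & \text{if } p\% \le \dfrac{k+1}{k+2} \\[8pt] \alpha_s^{k+1}\bar c_{(k)} \left[ (k+2)(1-p\%) \right]^{-1/(k+1)} & \text{if } p\% > \dfrac{k+1}{k+2} \end{cases} \tag{36}$$

where, of course, $p\% = p/100$ and $p$ is a number between 0 and 100.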
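
As a worked illustration of eq. (36), the half-width of the smallest credible interval can be computed as below. This is a sketch rather than the authors' Mathematica package: the function name is hypothetical, and `p` is the fractional credibility $p\% = p/100$, assumed to lie strictly between 0 and 1.

```python
def credible_half_width(p, alpha_s, known):
    """Half-width d_k^(p) of the smallest credible interval, eq. (36).

    p       : fractional degree of belief, 0 < p < 1 (e.g. 0.68)
    alpha_s : value of the coupling at mu = Q
    known   : list of calculated coefficients [c_0, ..., c_k]
    """
    k = len(known) - 1
    scale = alpha_s ** (k + 1) * max(abs(c) for c in known)  # alpha_s^(k+1) * cbar_(k)
    plateau = (k + 1) / (k + 2)            # degree of belief carried by the flat region
    if p <= plateau:
        return scale * p / plateau         # interval ends inside the flat region
    return scale * ((k + 2) * (1.0 - p)) ** (-1.0 / (k + 1))  # interval reaches the tails
```

For two known coefficients of unit size and $\alpha_s = 0.1$, for instance, the 50% interval has half-width $0.0075$, while the 95% interval extends to about $0.026$: because of the power-law tails, the interval grows much faster than linearly once $p\%$ exceeds $(k+1)/(k+2)$.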



The result for $f(\Delta_k|c_0,\dots,c_k)$ in eq. (34) can be generalised to any choice of $f(c_n|\bar c)$ and $f_\epsilon(\bar c)$,
i.e. beyond the choices of eqs. (20) and (22). Using the derivation given in appendix B.4 we obtain,
still within the approximation of eq. (33),

$$f_\epsilon(\Delta_k|c_0,\dots,c_k) \simeq \frac{1}{f_\epsilon(c_0,\dots,c_k)}\, \frac{1}{\alpha_s^{k+1}} \int f(c_0|\bar c) \cdots f(c_k|\bar c)\, f\!\left( c_{k+1} = \frac{\Delta_k}{\alpha_s^{k+1}} \,\Big|\, \bar c \right) f_\epsilon(\bar c)\, d\bar c , \tag{37}$$

where we have now explicitly allowed for the possibility of expressing intermediate quantities as
a function of $\epsilon$ (eq. (32) was instead already written in the $\epsilon \to 0$ limit). This expression will be
used for the numerical evaluations of densities and credible intervals proposed in the Mathematica
package available from the authors.

Since the results in eqs. (34) and (37) were obtained using the approximation in eq. (33), we
now wish to check it by comparing them to numerical estimates of the exact density (32). In order
to do so we perform a numerical integration of eq. (32), rewritten in the form

$$f(\Delta_k|c_0,\dots,c_k) = \int \left[ \delta\!\left( \Delta_k - \sum_{n=k+1}^{\infty} \alpha_s^n c_n \right) \right] \left[ \prod_{n=k+1}^{\infty} f(c_n|\bar c) \right] f(\bar c|c_0,\dots,c_k)\, d\bar c\, dc_{k+1}\, dc_{k+2} \cdots , \tag{38}$$

where $f(\bar c|c_0,\dots,c_k)$ is given in eq. (30) and the $f(c_n|\bar c)$ in eq. (20). Figure 2 shows the numerical
results for $k = 0$, 1 and 2 and the corresponding analytical approximation for $f(\Delta_k|c_0,\dots,c_k)$ in
eq. (34). We can see that the agreement is extremely good, especially when small (realistic) values
of $\alpha_s$ are used. We will therefore rely on the approximation of equation (33) for our predictions of
densities for $\Delta_k$ in the rest of this paper.

Figure 2: Numerical estimates of the exact densities $f(\Delta_k|c_0,\dots,c_k)$ (continuous curves) and their
analytical approximations in eq. (34) (dashed curves) in the case $\bar c_{(k)} = 1$ for $k = 0$ (left), $k = 1$
(middle), and $k = 2$ (right), for $\alpha_s = 0.5$ (top row) and $\alpha_s = 0.12$ (bottom row). These numerical
estimates are computed by integrating over the distributions for 10 unknown coefficients, the results
being stable when using more. Using values of $\alpha_s$ of the order of 0.2 or 0.3 does not degrade
significantly the quality of the approximation seen here in the $\alpha_s = 0.12$ case.
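
A quick Monte Carlo check of the approximation in eq. (33) can be set up along the lines of eq. (38). The sketch below is an assumption-laden illustration, not the numerical integration used for Figure 2: it draws $\bar c$ from eq. (30) by inverting its cumulative distribution, draws ten unknown coefficients from the uniform density of eq. (20), and measures how often the resulting remainder falls inside a given interval. The function names, the fixed seed and the sample size are choices made here.

```python
import random

def sample_remainder(known, alpha_s, n_terms=10, rng=random):
    """Draw one value of Delta_k following the structure of eq. (38).

    cbar is drawn from eq. (30) by inverse-CDF sampling, then n_terms
    unknown coefficients are drawn uniformly in [-cbar, cbar], eq. (20).
    """
    k = len(known) - 1
    cmax = max(abs(c) for c in known)
    u = 1.0 - rng.random()                   # uniform in (0, 1]
    cbar = cmax * u ** (-1.0 / (k + 1))      # solves 1 - (cmax/cbar)^(k+1) = 1 - u
    return sum(alpha_s ** n * rng.uniform(-cbar, cbar)
               for n in range(k + 1, k + 1 + n_terms))

def coverage(known, alpha_s, half_width, n_samples=50000, seed=1):
    """Fraction of sampled remainders with |Delta_k| <= half_width."""
    rng = random.Random(seed)
    hits = sum(abs(sample_remainder(known, alpha_s, rng=rng)) <= half_width
               for _ in range(n_samples))
    return hits / n_samples
```

For example, with `known = [1.0, 1.0]` and $\alpha_s = 0.12$, eq. (36) gives a 68% half-width of `0.12**2 * (3 * 0.32) ** -0.5`; passing this value to `coverage` should return a fraction close to 0.68 if neglecting the terms beyond $c_{k+1}$ is a good approximation.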



3 Comparison with the conventional method 

In deriving the density for $\Delta_k$ in the previous section we made no reference to the scale variation $\delta_k$ of the partial sum $\sigma_k(Q,\mu)$ which is usually employed in the conventional uncertainty estimate $[\sigma_k^-, \sigma_k^+]$ of section 2.1. In order to assess the compatibility of the two methods, we now wish to study the relation between the density for $\Delta_k$ and an interval of the kind $[\sigma_k^-, \sigma_k^+]$. Given a specific series and a set of coefficients $(c_0,\dots,c_k)$ we wish to evaluate

$$
C(\Delta_k \in [\Delta_k^-,\Delta_k^+]\,|\,c_0,\dots,c_k) = \int_{\Delta_k^-}^{\Delta_k^+} f(\Delta_k|c_0,\dots,c_k)\, d\Delta_k \tag{39}
$$

and, for definiteness, we now take $[\sigma_k^-, \sigma_k^+]$ as the interval given by eq. (8), so that we can set

$$
\Delta_k^- = \min(\sigma_k(Q,Q/2),\sigma_k(Q,2Q)) - \sigma_k = \sigma_k^- - \sigma_k \tag{40}
$$
$$
\Delta_k^+ = \max(\sigma_k(Q,Q/2),\sigma_k(Q,2Q)) - \sigma_k = \sigma_k^+ - \sigma_k \tag{41}
$$

Since the shape of $\sigma_k(Q,\mu)$, and therefore the values of $\Delta_k^-$ and $\Delta_k^+$, depend on all the values of the calculated coefficients $(c_0,\dots,c_k)$, while the density function $f(\Delta_k|c_0,\dots,c_k)$ depends only on their maximum $\bar c_{(k)}$ (see eq. (34)), it is important to make sure that different sets of coefficients, sharing the same $\bar c_{(k)}$, do not typically lead to broadly different estimates for the degree of belief $C(\Delta_k \in [\Delta_k^-,\Delta_k^+]\,|\,c_0,\dots,c_k)$ in eq. (39).

Figure 3: Numerical estimates of the exact densities $f(x_k|\bar c_{(k)})$, where $x_k(c_0,\dots,c_k) = C(\Delta_k \in [\Delta_k^-,\Delta_k^+]\,|\,c_0,\dots,c_k)$ is the degree of belief of the scale variation interval, for $\bar c_{(k)} = 1$, for $k = 1$ (left), $k = 2$ (middle) and $k = 3$ (right). Each plot is obtained using $N = 10^4$ samples.



To test this in practice, we evaluate the integral (39) for many different configurations of the coefficients $(c_0,\dots,c_k)$, in the case $\bar c_{(k)} = 1$. Figure 3 shows the distribution of the values of the degrees of belief that are obtained for the three perturbative orders $k = 1, 2, 3$. The typical degree of belief $C(\Delta_k \in [\Delta_k^-,\Delta_k^+]\,|\,c_0,\dots,c_k)$ predicted by the model can be seen to be largely unaffected by the precise values of the coefficients, its distribution taking an almost Dirac-delta shape at the values 0.57, 0.96 and 0.99 for $k = 1, 2, 3$ respectively. The peak at $x_3 \simeq 0$ in the rightmost plot can be understood as an artifact due to configurations where $\sigma_3(Q,2Q)$ and $\sigma_3(Q,Q/2)$ are accidentally close to each other, resulting in a vanishing $[\Delta_3^-,\Delta_3^+]$ interval. It can be made to disappear by modifying the choice of the interval and using instead $[\Delta_k^-,\Delta_k^+] = [\min\{\sigma_k(Q,\mu)\} - \sigma_k, \max\{\sigma_k(Q,\mu)\} - \sigma_k]$, i.e. corresponding to eq. (9) rather than eq. (8).

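The numerical results of figure 3 can be understood through the following analytical approximations. First, we modify slightly the interval $[\Delta_k^-,\Delta_k^+]$ considered above. We make it symmetric around $\sigma_k$ and of half-width $\delta_k/2$, where $\delta_k$ is given in eq. (13), and we consider

$$
C(\Delta_k \in [\Delta_k^-,\Delta_k^+]\,|\,c_0,\dots,c_k) \simeq C\Big(\Delta_k \in \Big[-\frac{\delta_k}{2},\frac{\delta_k}{2}\Big]\,\Big|\,c_0,\dots,c_k\Big) \tag{42}
$$

Using eq. (34), we get (see appendix B.5)

$$
C\Big(\Delta_k \in \Big[-\frac{\delta_k}{2},\frac{\delta_k}{2}\Big]\,\Big|\,c_0,\dots,c_k\Big) =
\begin{cases}
1 - \dfrac{1}{k+2}\left(\dfrac{2\,\bar c_{(k)}}{3k\beta_0|c_k|}\right)^{k+1} & \text{if } \dfrac{\delta_k}{2} \ge \alpha_s^{k+1}\bar c_{(k)} \;\Leftrightarrow\; |c_k| \ge \dfrac{2}{3k\beta_0}\bar c_{(k)} \\[3ex]
\dfrac{k+1}{k+2}\,\dfrac{3k\beta_0|c_k|}{2\,\bar c_{(k)}} & \text{if } \dfrac{\delta_k}{2} < \alpha_s^{k+1}\bar c_{(k)} \;\Leftrightarrow\; |c_k| < \dfrac{2}{3k\beta_0}\bar c_{(k)}
\end{cases} \tag{43}
$$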

This result is fully independent of the coefficients in the approximation (or in the case) where $c_k = \bar c_{(k)}$. It predicts the Dirac distribution-like shape observed in figure 3 and the variation of its position with the value of $k$. For $k = 1$, $k = 2$ and $k = 3$, it gives for the degree of belief (43) the values 61% (using the lower expression in eq. (43)), 96% and 99.6% (using the upper expression in eq. (43)) respectively, using $\beta_0 = 0.61$ and $c_k = \bar c_{(k)}$. These values are in good agreement with those obtained from the numerical estimates of the exact densities in figure 3. The $k$-dependence of the result in eq. (43) shows that the degree of belief of the interval $[\sigma_k - \delta_k/2, \sigma_k + \delta_k/2]$ is not a constant, but depends instead on the perturbative order at which we are working. When calculating higher orders in a perturbative series not only the size of the conventional residual uncertainty decreases, but also its degree of belief, as evaluated by our new method, increases. Note also that this method of evaluating the degree of belief of an uncertainty interval avoids one specific shortfall of the conventional method, namely that its estimate of the theoretical uncertainty may become unreasonably small if the last calculated coefficient happens by accident to be much smaller than the others or if accidental cancellations take place.
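The values quoted for eq. (43) can be reproduced with a few lines of code. The following is a minimal sketch, assuming $\delta_k \simeq 3k\beta_0\alpha_s^{k+1}|c_k|$ as in eq. (13) and $\beta_0 = 0.61$; the function and argument names are hypothetical.

```python
def scale_variation_belief(k, beta0=0.61, ck_over_cbar=1.0):
    """Degree of belief of [-delta_k/2, delta_k/2] according to eq. (43)."""
    # delta_k / 2 expressed in units of alpha_s^(k+1) * cbar_(k)
    half_width = 1.5 * k * beta0 * ck_over_cbar
    if half_width >= 1.0:
        # upper expression: the interval extends into the power-law tails
        return 1.0 - (1.0 / half_width) ** (k + 1) / (k + 2)
    # lower expression: the interval lies inside the flat part of the density
    return (k + 1) / (k + 2) * half_width

for k in (1, 2, 3):
    print(k, f"{100 * scale_variation_belief(k):.1f}%")
```

The argument `ck_over_cbar` is the ratio $|c_k|/\bar c_{(k)}$; lowering it below one shows how a last coefficient that is accidentally small reduces the degree of belief of the scale-variation band.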



It would be tempting to consider eq. (43) as the main result of our Bayesian model, allowing one to associate a degree of belief to the uncertainty bands given by the conventional method. The simplicity of these equations, and their numerical values tantalisingly (though entirely accidentally) close to the confidence levels of Gaussian sigmas, make them apparently good candidates for such an identification. However, it is important to bear in mind that these equations depend on the choice made for the density function in eq. (20), as well as on the various approximations made in deriving them. As such, they cannot be considered as strictly valid in general, although they offer a very useful first approximation when trying to gauge the degree of belief of a conventional uncertainty band generated by scale variations.

In practice, one would like to be able to abandon the scale variations method altogether, and determine the degree of belief of any interval of one's choosing. In general we will therefore not use eq. (43), but rather estimate any desired p%-credible interval numerically using the density function (37), without any reference to the conventional method.

4 Series starting at non-zero order $\alpha_s^l$

Oftentimes, one may wish to consider a perturbative series starting at a non-zero order in $\alpha_s$,

$$
\sigma = \sum_{n=l}^{\infty} c_n \alpha_s^n . \tag{44}
$$

When this is the case, only $k + 1 - l$ coefficients (rather than $k + 1$) are calculated when the series is known up to perturbative order $k$. The results of our model given in the previous section should then be modified as follows.

Eqs. (23) and (30) become

$$
f(\bar c|c_l,\dots,c_k) = n_c\,\frac{(\max(|c_l|,\dots,|c_k|))^{n_c}}{\bar c^{\,n_c+1}}\,\chi_{\bar c \ge \max(|c_l|,\dots,|c_k|)} = n_c\,\frac{\bar c_{(k)}^{\,n_c}}{\bar c^{\,n_c+1}}\,\chi_{\bar c \ge \bar c_{(k)}} \tag{45}
$$

where we have introduced the number of known coefficients,

$$
n_c = k + 1 - l . \tag{46}
$$

Note also that $\bar c_{(k)}$ should now formally be defined as $\max(|c_l|,\dots,|c_k|)$. We have not changed its notation so as not to proliferate the number of different symbols.



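Eqs. (24) and (31) are similarly modified to

$$
f(c_n|c_l,\dots,c_k) = \frac{1}{2}\,\frac{n_c}{n_c+1}\,\frac{(\max(|c_l|,\dots,|c_k|))^{n_c}}{(\max(|c_n|,|c_l|,\dots,|c_k|))^{n_c+1}}
= \frac{n_c}{n_c+1}\,\frac{1}{2\bar c_{(k)}}
\begin{cases}
1 & \text{if } |c_n| \le \bar c_{(k)} \\[1ex]
\dfrac{1}{(|c_n|/\bar c_{(k)})^{n_c+1}} & \text{if } |c_n| > \bar c_{(k)}
\end{cases} \tag{47}
$$

Eq. (34) becomes

$$
f(\Delta_k|c_l,\dots,c_k) \simeq \frac{n_c}{n_c+1}\,\frac{1}{2\alpha_s^{k+1}\bar c_{(k)}}
\begin{cases}
1 & \text{if } |\Delta_k| \le \alpha_s^{k+1}\bar c_{(k)} \\[1ex]
\dfrac{1}{(|\Delta_k|/(\alpha_s^{k+1}\bar c_{(k)}))^{n_c+1}} & \text{if } |\Delta_k| > \alpha_s^{k+1}\bar c_{(k)}
\end{cases} \tag{48}
$$

and from this one derives the result corresponding to eq. (36) for the width of the smallest p%-credibility interval:

$$
d_k^{(p)} =
\begin{cases}
\alpha_s^{k+1}\bar c_{(k)}\,\dfrac{n_c+1}{n_c}\,p\% & \text{if } p\% \le \dfrac{n_c}{n_c+1} \\[2ex]
\alpha_s^{k+1}\bar c_{(k)}\,[(n_c+1)(1-p\%)]^{-1/n_c} & \text{if } p\% > \dfrac{n_c}{n_c+1}
\end{cases} \tag{49}
$$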
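Eq. (49) translates directly into a small function. This is a sketch under the assumption that the degree of belief is passed as a fraction between 0 and 1 and that the known coefficients are given as a list $c_l, \dots, c_k$; the function name and the example series are hypothetical.

```python
def credible_half_width(p, alpha_s, known_coeffs, k):
    """Half-width d_k^(p) of the smallest p-credible interval, eq. (49).

    p            -- degree of belief as a fraction, 0 < p < 1
    known_coeffs -- the n_c = k + 1 - l known coefficients c_l, ..., c_k
    k            -- highest known perturbative order
    """
    n_c = len(known_coeffs)
    scale = alpha_s ** (k + 1) * max(abs(c) for c in known_coeffs)
    if p <= n_c / (n_c + 1):
        # interval contained in the flat part of the density (48)
        return scale * (n_c + 1) / n_c * p
    # interval reaching into the power-law tails of the density (48)
    return scale * ((n_c + 1) * (1.0 - p)) ** (-1.0 / n_c)

# Illustrative series (not from the paper): l = 1, k = 2, alpha_s = 0.1
for p in (0.5, 0.683, 0.955):
    print(p, credible_half_width(p, 0.1, [1.0, -0.5], 2))
```

The two branches join continuously at $p\% = n_c/(n_c+1)$, where both give $d_k^{(p)} = \alpha_s^{k+1}\bar c_{(k)}$.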



Finally, using the result in eq. (13), $\delta_k \simeq 3k\beta_0\alpha_s^{k+1}|c_k|$, which is unmodified by the fact that the series starts now at order $l$, one finds that the degree of belief associated to the interval given by the conventional scale-variation method, already given in eq. (43) for the $l = 0$ case, is modified as

$$
C\Big(\Delta_k \in \Big[-\frac{\delta_k}{2},\frac{\delta_k}{2}\Big]\,\Big|\,c_l,\dots,c_k\Big) =
\begin{cases}
1 - \dfrac{1}{n_c+1}\left(\dfrac{2\,\bar c_{(k)}}{3k\beta_0|c_k|}\right)^{n_c} & \text{if } \dfrac{\delta_k}{2} \ge \alpha_s^{k+1}\bar c_{(k)} \;\Leftrightarrow\; |c_k| \ge \dfrac{2}{3k\beta_0}\bar c_{(k)} \\[3ex]
\dfrac{n_c}{n_c+1}\,\dfrac{3k\beta_0|c_k|}{2\,\bar c_{(k)}} & \text{if } \dfrac{\delta_k}{2} < \alpha_s^{k+1}\bar c_{(k)} \;\Leftrightarrow\; |c_k| < \dfrac{2}{3k\beta_0}\bar c_{(k)}
\end{cases} \tag{50}
$$

For a process starting at order $\alpha_s$ (i.e. $l = 1$) this equation predicts a degree of belief of 46% at LO ($k = 1$), using the lower expression in eq. (50), 90% at NLO ($k = 2$) and 98.8% at NNLO ($k = 3$), using in both cases the upper expression in eq. (50) and $c_k = \bar c_{(k)}$. For a process starting at order $\alpha_s^2$ (i.e. $l = 2$) one predicts instead a degree of belief of 73% at LO ($k = 2$), 96% at NLO ($k = 3$) and 99.5% at NNLO ($k = 4$). In this case the upper expression in eq. (50) always applies.



5 A realistic application: $e^+e^- \to$ hadrons

The total cross section $\sigma(e^+e^- \to \gamma \to \text{hadrons})$ is one of the best known observables in perturbative QCD, its coefficients being known exactly up to order $\alpha_s^3$, and even $c_4$ being known approximately. This process is therefore an ideal place to test the behaviour of our Bayesian model, and compare it to the results of the conventional uncertainty estimate.
We write this cross section as

$$
\sigma_4(Q) = \sigma_0(Q)\Big(1 + \sum_{n=1}^{4} c_n\alpha_s^n(Q)\Big) \tag{51}
$$

and, for $n_f = 5$ massless flavours, we have (the values are reviewed for instance in [13])

$$
c_1 = 0.31831, \quad c_2 = 0.142785, \quad c_3 = -0.412969, \quad c_4 \simeq -0.821356 \tag{52}
$$

These coefficients lead to the following partial sums$^{7}$ for $\sigma_{QCD,k}(Q) = \sigma_k(Q)/\sigma_0(Q) - 1$ and $\alpha_s(Q) = 0.118$:

$$
\sigma_{QCD,1} = 0.0375606, \quad \sigma_{QCD,2} = 0.0395487, \quad \sigma_{QCD,3} = 0.0388702 \tag{53}
$$

where we have dropped for convenience the argument $Q$. One can now apply the conventional uncertainty estimate method of section 2.1. Using for definiteness the convention in eq. (9), one finds

$$
[\sigma^-_{QCD,1}, \sigma^+_{QCD,1}] = [0.03401, 0.04197] \tag{54}
$$
$$
[\sigma^-_{QCD,2}, \sigma^+_{QCD,2}] = [0.03871, 0.03980] \tag{55}
$$
$$
[\sigma^-_{QCD,3}, \sigma^+_{QCD,3}] = [0.03855, 0.03893] \tag{56}
$$

One can compare these "uncertainty intervals" with the successive perturbative results given above, and see that indeed order by order the higher-order result is inside the interval.$^{8}$

$^{7}$ Note that in [13] $\sigma_{QCD}$ is denoted by $\delta_{QCD}$ instead. We have modified the notation to avoid confusion with our own definition for $\delta_k$, which represents an uncertainty interval rather than a value of the truncated series.

Figure 4: Comparison of the uncertainty intervals for the $e^+e^- \to$ hadrons process, as given by the conventional method of scale variations (first interval on the left of each group) and by our model (the latter for two different values of degree of belief, 68.3% and 95.5%, respectively middle and right of each group), for $\alpha_s = 0.118$. We have used the definition $\sigma_{QCD,k}(Q) = \sigma_k(Q)/\sigma_0(Q) - 1$.
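The partial sums of eq. (53) and the statement that each higher-order result falls inside the preceding scale-variation interval can be checked directly from the numbers quoted in eqs. (52) and (54)-(56). The snippet below is only an arithmetic sketch with hypothetical variable names; it does not recompute the scale variations themselves.

```python
coeffs = {1: 0.31831, 2: 0.142785, 3: -0.412969, 4: -0.821356}   # eq. (52)
alpha_s = 0.118

def partial_sum(k):
    """sigma_QCD,k = sum_{n=1}^{k} c_n alpha_s^n."""
    return sum(coeffs[n] * alpha_s ** n for n in range(1, k + 1))

# Scale-variation intervals as quoted in eqs. (54)-(56)
intervals = {1: (0.03401, 0.04197), 2: (0.03871, 0.03980), 3: (0.03855, 0.03893)}

for k, (low, high) in intervals.items():
    next_order = partial_sum(k + 1)
    print(f"k={k}: sigma_QCD,{k} = {partial_sum(k):.7f}, "
          f"order k+1 inside interval: {low <= next_order <= high}")
```

For $k = 3$ the check uses the approximately known coefficient $c_4$.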

This is as far as the conventional uncertainty estimate can go. At this point one can use our 
model to do one of two things (or both): either one calculates the degree of belief of the intervals 
given above, or one finds the intervals corresponding to given values of degree of belief, for instance 
68.3%, 95.5% and 99.7%. 

For the first option, we find

$$
C(\sigma_{QCD} \in [\sigma^-_{QCD,1}, \sigma^+_{QCD,1}]\,|\,c_1) = 45.8\% \tag{57}
$$
$$
C(\sigma_{QCD} \in [\sigma^-_{QCD,2}, \sigma^+_{QCD,2}]\,|\,c_1, c_2) = 58.4\% \tag{58}
$$
$$
C(\sigma_{QCD} \in [\sigma^-_{QCD,3}, \sigma^+_{QCD,3}]\,|\,c_1, c_2, c_3) = 77.2\% \tag{59}
$$

These values have been obtained via numerical integration of the density $f(\Delta_k|c_1,\dots,c_k)$ in eq. (48). One can compare them to the values given by the analytical approximations in eq. (50) for $l = 1$ and $k = 1, 2, 3$ respectively, i.e. $n_c = k + 1 - l = 1, 2, 3$. One obtains 45.…%, 54.…% and 98.…% respectively. The first two values (both obtained with the lower expression in eq. (50)) can be seen to be in good agreement with the exact results. The big discrepancy for the third one can be explained with the fact that the actual interval $[\sigma^-_{QCD,3}, \sigma^+_{QCD,3}]$ happens to be quite asymmetric with respect to the central value $\sigma_{QCD,3}$, whereas in the approximation (50) the interval $[-\delta_k/2, \delta_k/2]$ is symmetric. This serves as a reminder that one should always resort to the full numerical evaluation of the degree of belief whenever accurate results are sought.

$^{8}$ These uncertainty intervals have been evaluated by using in all cases an evolution equation for $\alpha_s$ up to $\beta_2$, i.e. what is needed for $\sigma_{QCD,3}$. One may have used lower-order accuracies when dealing with $\sigma_{QCD,1}$ or $\sigma_{QCD,2}$, but we have explicitly checked that this changes at most the last significant figure in the numbers given above.
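The analytical approximation of eq. (50) can be evaluated for this process from the coefficients of eq. (52). The sketch below assumes $\beta_0 = 0.61$ and $\delta_k \simeq 3k\beta_0\alpha_s^{k+1}|c_k|$ from eq. (13); the function name is hypothetical, and the printed numbers are a re-evaluation under these assumptions rather than values copied from the text.

```python
def belief_of_scale_interval(known_coeffs, k, beta0=0.61):
    """Eq. (50): degree of belief of [-delta_k/2, delta_k/2].

    known_coeffs -- c_l, ..., c_k (the last entry is c_k)
    """
    n_c = len(known_coeffs)
    cbar = max(abs(c) for c in known_coeffs)
    # delta_k / 2 in units of alpha_s^(k+1) * cbar_(k)
    half_width = 1.5 * k * beta0 * abs(known_coeffs[-1]) / cbar
    if half_width >= 1.0:
        return 1.0 - (1.0 / half_width) ** n_c / (n_c + 1)   # upper expression
    return n_c / (n_c + 1) * half_width                      # lower expression

c = [0.31831, 0.142785, -0.412969]    # c_1, c_2, c_3 from eq. (52), l = 1
for k in (1, 2, 3):
    print(k, f"{100 * belief_of_scale_interval(c[:k], k):.1f}%")
```

For $k = 1$ and $k = 2$ the lower expression applies, while for $k = 3$ the last coefficient is also the largest one and the upper expression is used.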

For the second option, we find, using the notation $\mathcal{C}_k^{(p)}$ to denote the minimal p%-credible interval of the form $[\sigma_k - d_k^{(p)}, \sigma_k + d_k^{(p)}]$, where $d_k^{(p)}$ is defined implicitly in eq. (35) (see also appendix B.3),

$$
\mathcal{C}_1^{(68.3)} = [0.0307747, 0.0443465] \tag{60}
$$
$$
\mathcal{C}_2^{(68.3)} = [0.0390012, 0.0400962] \tag{61}
$$
$$
\mathcal{C}_3^{(68.3)} = [0.0387973, 0.0389431] \tag{62}
$$

and

$$
\mathcal{C}_1^{(95.5)} = [-0.0023475, 0.0774687] \tag{63}
$$
$$
\mathcal{C}_2^{(95.5)} = [0.0376789, 0.0414185] \tag{64}
$$
$$
\mathcal{C}_3^{(95.5)} = [0.0387297, 0.0390107] \tag{65}
$$

These intervals$^{9}$ can then be compared to those returned by the conventional method in eqs. (54, 55, 56). This comparison is given in graphical form in figure 4. One can see how the 68.3%-credible intervals are not too dissimilar from those predicted by the conventional method of scale variations. It is worth noting how the former tend to become smaller than the latter as the perturbative order increases, pointing to a potential overestimate of the theoretical uncertainty by the conventional method at higher orders.


6 Discussion about the hypotheses of the model 



Our model was built making the choices in eqs. (20) and (22) for the densities $f(c_n|\bar c)$ and $f_\varepsilon(\bar c)$. We made there the choice of using a flat prior for $\ln\bar c$ (rather than $\bar c$ itself) in eq. (22), and for $c_n$ instead in eq. (20). We discuss below the reasoning behind these choices.



6.1 Choice of the density function $f(c_n|\bar c)$

The choice of exactly what variable to use to express a prior density, e.g. the logarithm of a parameter rather than the parameter itself, is related to the assumed nature of said parameter. Suppose we used $\ln c_n$ instead of $c_n$ in eq. (20) and defined a uniform density

$$
f(\ln c_n|\bar c, h) = \frac{1}{2h}\,\chi_{\ln\bar c - h \,\le\, \ln c_n \,\le\, \ln\bar c + h} \tag{66}
$$

where $h$ is an arbitrary parameter. This amounts to considering it as likely to find $c_n$ between $\bar c/\exp(h)$ and $\bar c$ as between $\bar c$ and $\bar c\exp(h)$. One can debate whether this behaviour is more or less appropriate than the one used in eq. (20) where $c_n$ is equally likely to lie between $-\bar c$ and zero as between zero and $\bar c$. However, the main drawback of eq. (66) is that it requires the introduction of a new, a priori unknown, parameter $h$ which controls the spread of the coefficients. At least three perturbative coefficients would then need to be known before the model can estimate a credibility interval.

We have therefore concluded that the hypothesis in eq. (20) not only already describes sufficiently well the observed typical relations between perturbative coefficients, but also provides the simplest model (simplicity being a strong guiding principle of our model, as we wish to be able to control well the information we introduce into it).

$^{9}$ These numerical values have been calculated discarding the coefficient of $\alpha_s^0$, on the ground that it controls an exclusively electroweak process, and therefore it should not have a say on the size of the coefficients of a perturbative expansion in $\alpha_s$. They differ very slightly (at the level of the third/fourth significant figure) from those that can be obtained using eq. (49), since they have been calculated by integrating numerically the credibility distributions.

Note that in the definition of the model (and therefore in the numerical evaluation of the credible intervals) one may employ for $f(c_n|\bar c)$ a form which differs from the uniform step function given in eq. (20), for instance a Gaussian distribution of mean 0 and standard deviation $\bar c$. We have checked that in this case the intervals are not modified in a significant way, showing that the results of our model are robust with respect to reasonable variations of this initial hypothesis.

6.2 Choice of the density function $f(\ln\bar c)$

The value of the sum of a perturbative series depends on the value of $\bar c$. Choosing a density for $\bar c$ which is uniform in $\bar c$ itself rather than in its logarithm amounts to trying to predict the precise value of such a series rather than just its order of magnitude. We find the former too strong a constraint, and prefer therefore to limit ourselves to the second choice. On a technical side, we also find that when using a prior uniform in $\bar c$ one then needs at least two calculated coefficients in order to have a non-null density on its theoretical uncertainty, whereas in the $\ln\bar c$ case one coefficient is already sufficient to give an indication about the order of magnitude of the higher order coefficients and therefore about the remainder of the series.

Note also that it is sufficient to use in eq. (22) a density $f_\varepsilon(\ln\bar c)$ that is uniform in $\ln\bar c$ only in the $\varepsilon \to 0$ limit. For finite values of $\varepsilon$ this requirement is not necessary.

6.3 Choice of the expansion parameter

Another modification of the model would be to use an expansion parameter that differs from $\alpha_s$, so that

$$
\sigma = \sum_n c_n\alpha_s^n = \sum_n (\lambda^n c_n)\left(\frac{\alpha_s}{\lambda}\right)^n \tag{67}
$$

This corresponds to a redefinition of the coefficients $c_n$ into

$$
c'_n = \lambda^n c_n \tag{68}
$$

and the density function in eq. (20) would now be defined by

$$
f(c'_n|\bar c') = \frac{1}{2\bar c'}\,\chi_{|c'_n| \le \bar c'} \tag{69}
$$

where $\bar c'$ is a parameter that applies to the new set of coefficients $c'_n$.

This choice would not modify the expressions for the densities over the unknown coefficients $f(c'_n|c'_0,\dots,c'_k)$, but only the one over the residual sum $f(\Delta_k|c'_0,\dots,c'_k)$ since the approximation (33) would now read

$$
\Delta_k \simeq c'_{k+1}\left(\frac{\alpha_s}{\lambda}\right)^{k+1} \tag{70}
$$

so that eq. (34) is replaced by

$$
f(\Delta_k|c'_0,\dots,c'_k) \simeq \frac{k+1}{k+2}\,\frac{1}{2(\alpha_s/\lambda)^{k+1}\bar c'_{(k)}}
\begin{cases}
1 & \text{if } |\Delta_k| \le (\alpha_s/\lambda)^{k+1}\bar c'_{(k)} \\[1ex]
\dfrac{1}{\big(|\Delta_k|/((\alpha_s/\lambda)^{k+1}\bar c'_{(k)})\big)^{k+2}} & \text{if } |\Delta_k| > (\alpha_s/\lambda)^{k+1}\bar c'_{(k)}
\end{cases} \tag{71}
$$

where now $\bar c'_{(k)}$ is of course the maximum of the new known coefficients $c'_n$.

Different values for $\lambda$ relate to different speeds of convergence of the series, either quicker ($\lambda > 1$) or slower ($\alpha_s < \lambda < 1$). Of course one must be careful not to end up with an expansion parameter which is too large, because this will eventually invalidate the use of the approximation in eq. (33).



7 Partially known higher orders 

The model we have considered so far assumes perfect knowledge of some coefficients, up to order $k$, and total ignorance of those of higher order. In practice, it is often possible to know part of a higher order coefficient, typically calculated within some approximation or obtained as an expansion of an all-order resummation. It is nevertheless straightforward to extend the model to account for such cases.

Two new building blocks are required to adapt the model. First of all, if $\tilde c_{k+1}$ is an approximation of $c_{k+1}$ it should not provide more information than the true value $c_{k+1}$ itself. If the real value $c_{k+1}$ is known, knowledge of the approximate value $\tilde c_{k+1}$ must not change anything: for a set of coefficients $\{c_i\}$ it must hold

$$
f(\{c_i\}|\tilde c_{k+1}, c_{k+1}) = f(\{c_i\}|c_{k+1}) , \tag{72}
$$
$$
f(\bar c|\tilde c_{k+1}, c_{k+1}) = f(\bar c|c_{k+1}) , \tag{73}
$$
$$
f(\bar c, \{c_i\}|\tilde c_{k+1}, c_{k+1}) = f(\bar c, \{c_i\}|c_{k+1}) . \tag{74}
$$

Secondly, one must decide how reliable a given approximation $\tilde c_{k+1}$ of $c_{k+1}$ is. We must introduce a density function $f(\tilde c_{k+1}|c_{k+1})$ for the value $\tilde c_{k+1}$, given the true $c_{k+1}$. The choice of this density will depend on the way $\tilde c_{k+1}$ was obtained. One possible parametrisation is for instance the log-normal density

$$
f(\tilde c_{k+1}|c_{k+1}) = \frac{1}{|\tilde c_{k+1}|}\,\frac{1}{\sqrt{2\pi}\,\ln f}\,\exp\left(-\frac{(\ln(\tilde c_{k+1}/c_{k+1}))^2}{2(\ln f)^2}\right) \tag{75}
$$

for some chosen value of the parameter $f$. It more or less corresponds to $\tilde c_{k+1}$ estimating $c_{k+1}$ up to a factor of order $f$.

The densities on the true value of the coefficient $c_{k+1}$ which is known only approximately, and on the completely unknown coefficients $c_n$ can then be written, up to normalisation factors collectively denoted by $\mathcal{N}$, as (see appendix B.6)

$$
f(c_{k+1}|c_0,\dots,c_k,\tilde c_{k+1}) = \mathcal{N}\, f(c_{k+1}|c_0,\dots,c_k)\, f(\tilde c_{k+1}|c_{k+1}) \tag{76}
$$
$$
f(c_n|c_0,\dots,c_k,\tilde c_{k+1}) = \mathcal{N} \int f(c_n, c_{k+1}|c_0,\dots,c_k)\, f(\tilde c_{k+1}|c_{k+1})\, dc_{k+1} \tag{77}
$$

More generally, for arbitrary sets of known coefficients $C_K = \{c_i\}_{i\in[0,k]}$, approximations $\tilde C_A = \{\tilde c_i\}_{i\in A}$ of coefficients $C_A = \{c_i\}_{i\in A}$ and totally unknown coefficients $C_N = \{c_i\}_{i\in N}$ we can write

$$
f(C_N, C_A|C_K, \tilde C_A) = \mathcal{N}\, f(C_N, C_A|C_K)\, f(\tilde C_A|C_A) \tag{78}
$$

To get the density for the unknown coefficients, one then just integrates over $C_A$. To get the density over $\Delta_k$ one replaces eq. (78) in the definition (32).

Let us study for instance the case of one known coefficient $c_0$ and one approximation $\tilde c_1$. We want to obtain the density $f(\Delta_0|c_0, \tilde c_1)$. Depending on how much we trust the approximation, the uncertainty over $c_1$ or $c_2$ will predominate. In general, we need to keep track of both these coefficients in the expression for $\Delta_0$. We do not use the approximation (33) but rather the more accurate one

$$
\Delta_0 \simeq c_1\alpha_s + c_2\alpha_s^2 . \tag{79}
$$

The result for the density is then

$$
f\Big(\frac{\Delta_0}{\alpha_s}\Big|c_0,\tilde c_1\Big) \simeq f\Big(c_1 + c_2\alpha_s = \frac{\Delta_0}{\alpha_s}\Big|c_0,\tilde c_1\Big) \tag{80}
$$

The expression for $f(c_1 + c_2\alpha_s = x|c_0, \tilde c_1)$ can be obtained as

$$
\begin{aligned}
f(c_1 + \alpha_s c_2 = x|c_0,\tilde c_1) &= \int \delta(x - (c_1 + \alpha_s c_2))\, f(c_2, c_1|c_0,\tilde c_1)\, dc_2\, dc_1 \\
&= \int f(c_2, c_1 = x - \alpha_s c_2|c_0,\tilde c_1)\, dc_2 \\
&= \mathcal{N}\int f(c_2, c_1(x,c_2)|c_0)\, f(\tilde c_1|c_1(x,c_2))\, dc_2
\end{aligned} \tag{81}
$$

where we defined $c_1(x,c_2) = x - \alpha_s c_2$ to simplify the expression and we used equation (78). The result for $f(c_2, c_1(x,c_2)|c_0)$ is obtained in a similar way to $f(c_n|c_0,\dots,c_k)$ in (31):

$$
f(c_2, c_1(x,c_2)|c_0) = \lim_{\varepsilon\to 0}\frac{f_\varepsilon(c_2, c_1(x,c_2), c_0)}{f_\varepsilon(c_0)} \tag{82}
$$

Using eq. (27) for $k = 0$ and $k = 2$ we find

$$
f(c_1 + \alpha_s c_2 = x|c_0,\tilde c_1) = \mathcal{N}\int \left(\frac{1}{\bar c_{(2)}(x)}\right)^{3} f(\tilde c_1|c_1(x,c_2))\, dc_2 \tag{83}
$$

where we have defined $\bar c_{(2)}(x) = \max(|c_0|, |c_1(x,c_2)|, |c_2|)$.
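Eq. (83) lends itself to a direct numerical evaluation. The following is a minimal sketch, assuming the log-normal density of eq. (75) for $f(\tilde c_1|c_1)$, the cubic power of $1/\bar c_{(2)}(x)$ that follows from three coefficients sharing the same $\bar c$, and a plain trapezoidal rule on finite grids. The function names, grid ranges and grid sizes are illustrative assumptions, and the normalisation factor $\mathcal{N}$ is fixed numerically on the chosen range of $x$.

```python
import numpy as np

def lognormal(c_tilde, c, f):
    """Eq. (75): density of the approximation c_tilde given the true value c."""
    with np.errstate(divide="ignore", invalid="ignore"):
        ratio = c_tilde / c
    out = np.zeros_like(ratio)
    ok = np.isfinite(ratio) & (ratio > 0)        # same sign only
    out[ok] = np.exp(-np.log(ratio[ok]) ** 2 / (2 * np.log(f) ** 2)) \
        / (abs(c_tilde) * np.sqrt(2 * np.pi) * np.log(f))
    return out

def density(x_grid, c0, c1_tilde, f, alpha_s, c2_grid):
    """Eq. (83) on a grid of x = c_1 + alpha_s c_2, normalised on that grid."""
    dens = np.empty_like(x_grid)
    dc2 = np.diff(c2_grid)
    for n, x in enumerate(x_grid):
        c1 = x - alpha_s * c2_grid               # c_1(x, c_2)
        cbar2 = np.maximum(abs(c0), np.maximum(np.abs(c1), np.abs(c2_grid)))
        y = lognormal(c1_tilde, c1, f) / cbar2 ** 3
        dens[n] = np.sum(0.5 * (y[1:] + y[:-1]) * dc2)
    norm = np.sum(0.5 * (dens[1:] + dens[:-1]) * np.diff(x_grid))
    return dens / norm

x = np.linspace(0.2, 1.6, 701)
c2 = np.linspace(-40.0, 40.0, 8001)
dens = density(x, c0=0.9, c1_tilde=0.83, f=1.05, alpha_s=0.12, c2_grid=c2)
print("position of the maximum:", x[np.argmax(dens)])
```

Repeating the call with larger values of `f` broadens the curve, since the approximation $\tilde c_1$ then constrains $c_1$ less and less.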

In order to see how this works in practice, we choose $c_0 = 0.9$ and $\tilde c_1 = 0.83$ and we plot eq. (83) as a function of $x = \Delta_0/\alpha_s$ for various values of $f$. Figure 5 shows how the density is modified when $\tilde c_1$ is more and more trusted ($f$ is smaller and smaller). Consider at first the top left plot. When the approximation is not much trusted and $f$ is large ($f = 50$ in this case), the only useful information that we can get from $\tilde c_1$ is the sign of $c_1$. The total uncertainty over $\Delta_0/\alpha_s \simeq c_1 + \alpha_s c_2$ is largely dominated by the uncertainty over $c_1$. Its density coincides with two times the density $f(c_1|c_0)$ for positive values of $x$ and with $f(c_1|c_0,\tilde c_1)$. The differences, originating from the uncertainty over $c_2$ or from the small information provided by the particular value of $\tilde c_1$, are almost negligible (except around zero).

As $\tilde c_1$ gets more trusted, the uncertainty decreases and the degree of belief for $\Delta_0/\alpha_s$ to have a value around $\tilde c_1$ increases (top middle plots). The uncertainty over $c_1$ is still the limiting one but the information provided by $\tilde c_1$ is not negligible anymore. The full density still coincides with $f(c_1|c_0,\tilde c_1)$ but is starting to be different from $f(c_1|c_0)$.

Then comes a limit where the uncertainty over $c_1$ is of the same order as the one over $\alpha_s c_2$ (top-right and bottom-left plots; they are the same plot, shown for different scales on the x-axis). The full density over $c_1 + \alpha_s c_2$ now differs from $f(c_1|c_0,\tilde c_1)$ and its width is given both by this density and by the one over $f(c_2|c_0, c_1 = \tilde c_1)$.

As $\tilde c_1$ is considered to be a better and better approximation, the uncertainty over $c_2$ prevails. The difference between $c_1$ and $\tilde c_1$ is now negligible and the total density approaches the shape of $f(c_1 + \alpha_s c_2|c_0, c_1 = \tilde c_1)$ (bottom middle and right plots).

Figure 5: Density $f(c_1 + \alpha_s c_2|c_0,\tilde c_1) \simeq f(\Delta_0/\alpha_s|c_0,\tilde c_1)$ (solid curve) for $c_0 = 0.9$, $\tilde c_1 = 0.83$, $\alpha_s = 0.12$ and a log-normal density $f(\tilde c_1|c_1)$ of width parameter $\ln f$ with $f$ = 50; 2; 1.5; 1.2, from left to right in the top row, and $f$ = 1.2; 1.1; 1.05; 1.01, from left to right in the bottom row. The two dashed curves represent $2f(c_1|c_0) \simeq 2f(\Delta_0/\alpha_s|c_0)$ (flattest dashed curve) and $f(c_1 + \alpha_s c_2|c_0, c_1 = \tilde c_1) \simeq f(\Delta_0/\alpha_s|c_0, c_1 = \tilde c_1)$ (most peaked dashed curve). They correspond to the limits when $\tilde c_1$ is not trusted at all or when it is completely trusted. The last, dotted curve represents $f(c_1|c_0,\tilde c_1)$. It coincides with $f(\Delta_0/\alpha_s|c_0,\tilde c_1)$ as long as the uncertainty over $c_1$ is the limiting one.



8 Conclusions and outlook 

In this paper we have introduced a Bayesian model which allows one to characterise in terms of 
intervals of a given degree of belief (or degree of belief of a given interval) the residual theoretical 
uncertainty of a perturbative calculation. Our aim is to put on more solid ground the estimate of 
the uncertainty of a known result, not to improve in any way the calculation itself. This we try to 
achieve by formalising hypotheses on the behaviour of the coefficients of perturbative series, and 
then by deriving from these hypotheses the degree of belief values in a rigorous way. 

We have chosen to try to translate as closely as possible into our model the assumption which 
is implicitly made when employing the conventional method (scale variations) for estimating the 
uncertainty, namely that successive coefficients of a perturbative series tend to have similar size. 
One may or may not believe this hypothesis to be well grounded, and our choice is not necessarily 
true or even just the best possible one. However, what matters, and what this paper wants to 
provide, is not so much which hypotheses are made, but rather the formalism that allows one to 
derive from them a proper characterisation of the residual theoretical uncertainty: our framework 
can then be considered as a box into which to input one's favourite hypothesis about the behaviour 
of a series, and from which to extract the appropriate degree of belief values. 

We have found that, under the quite general assumption mentioned above and within the Bayesian framework formalised in section 2.2.1, the p%-credible interval $[\sigma_k - d_k^{(p)}, \sigma_k + d_k^{(p)}]$ of a series calculated up to order $k$,

$$
\sigma_k = c_l\alpha_s^l + \dots + c_k\alpha_s^k , \tag{84}
$$

with $l \ge 0$ and $k \ge l$, is given by

$$
d_k^{(p)} =
\begin{cases}
\alpha_s^{k+1}\max\{|c_l|,\dots,|c_k|\}\,\dfrac{n_c+1}{n_c}\,p\% & \text{if } p\% \le \dfrac{n_c}{n_c+1} \\[2ex]
\alpha_s^{k+1}\max\{|c_l|,\dots,|c_k|\}\,[(n_c+1)(1-p\%)]^{-1/n_c} & \text{if } p\% > \dfrac{n_c}{n_c+1}
\end{cases} \tag{85}
$$

with $n_c = k + 1 - l$ the number of known perturbative coefficients. The full credibility distribution can also be obtained, and is given in section 2.2.3.



In the calculation of QCD corrections to a simple process like $e^+e^- \to$ hadrons we see that the intervals given by the conventional renormalisation scale variation are not too dissimilar from the 68.3%-credible intervals given by our Bayesian model. These findings, detailed in section 5 and shown in graphical form in figure 4, are perhaps not surprising: the conventional method itself has been built and refined over the years into a form that often returns results compatible with the calculation of successive perturbative orders and with intuitive expectations, and the same hypothesis that it makes implicitly we have made explicitly. Nevertheless, within our method one can now state a precise interval and in addition a detailed degree of belief for it (and possibly bet on it). A Mathematica package implementing the results of this paper is available from the authors.

Obviously this is not the final word in terms of a rigorous characterisation of theoretical uncertainties. We have chosen a very simple process, and we have found a nice self-consistent picture. However, much more work remains to be done in order to extend the method to more complex processes. For one thing, one may wish to accommodate also the presence of a factorisation scale, and therefore additional ingredients like parton distribution or fragmentation functions. Secondly, our Bayesian model as it is now formulated inevitably fails when the behaviour of a physical process is known not to be self-similar to all orders: an obvious example is a process for which a new production channel opens at some perturbative order, or for which a particular kinematical configuration is selected. In such cases, an extension of our hypotheses is obviously called for. We are looking forward to exploring these new avenues in the future.

A Partial cross section, coefficients and renormalisation scale

Let

$$
\sigma(Q) = \sum_{i=0}^{\infty} c_i(Q,\mu)\,\alpha_s^i(\mu) \tag{86}
$$

be the total sum of a perturbative series of expansion parameter $\alpha_s$, which evolves according to

$$
\frac{d\alpha_s}{d\ln\mu^2} = \beta(\alpha_s) = -\sum_{j=0}^{\infty}\beta_j\,\alpha_s^{j+2} \tag{87}
$$

where $\beta(\alpha_s)$ is the beta function, $\mu$ is an arbitrary renormalisation scale, so that it holds

$$
0 = \frac{d\sigma}{d\ln\mu^2}
= \sum_{i=0}^{\infty}\left(\frac{dc_i}{d\ln\mu^2}\,\alpha_s^i - i\,c_i\,\alpha_s^{i-1}\sum_{j=0}^{\infty}\beta_j\,\alpha_s^{j+2}\right)
= \sum_{i=0}^{\infty}\alpha_s^i\left(\frac{dc_i}{d\ln\mu^2} - \sum_{j=0}^{i-1} j\,\beta_{i-1-j}\,c_j\right) \tag{88}
$$

so that

$$
\frac{dc_i}{d\ln\mu^2} = \sum_{j=0}^{i-1} j\,\beta_{i-1-j}\,c_j \qquad \forall i \ge 0 \tag{89}
$$

This equation already tells us that the $\mu$-dependence of the coefficient $c_i$ is controlled by the lower-order coefficients $c_j$, $j \le i - 1$. Moreover, it shows that $c_0$ and $c_1$ are independent of $\mu$. It is then straightforward to conclude that $c_i$ is a polynomial of degree less than or equal to $i - 1$ in $\ln(\mu^2/Q^2)$:

$$
c_i(Q,\mu) = \sum_{l=0}^{i-1} c_{i,l}\,\ln^l\frac{\mu^2}{Q^2} \qquad \forall i \ge 1 \tag{90}
$$

This is true for $i = 1$ since the derivative of $c_1$ with respect to $\mu$ is zero. Assuming that it is true up to some $i$, eq. (89) shows that $dc_{i+1}/d\ln\mu^2$ is a polynomial in $\ln(\mu^2/Q^2)$ of degree less than or equal to $i - 1$, which makes $c_{i+1}$ itself a polynomial of degree less than or equal to $i$.

  k    i = 1    i = 2    i = 3    i = 4
  1    0.61
  2    0.24     1.22
  3    0.07     0.49     1.83
  4    0.19     0.15     0.73     2.44

Table 1: First values of $i\beta_{k-i}$, calculated with $n_f = 5$. The first column gives the first four coefficients of the beta function.

Rewriting eq. (89) order by order in $\ln(\mu^2/Q^2)$, we can also obtain a recurrence relation giving the values of all the coefficients $c_{i,l}$ of the polynomial in eq. (90) in terms of the $c_j(Q,Q)$, $j < i$, and the $\beta_j$:

$$
c_{i,l} = \frac{1}{l}\sum_{j=0}^{i-1} j\,\beta_{i-1-j}\,c_{j,l-1} \tag{91}
$$

Hence, given the calculated values for $c_i(Q,Q)$ one can easily reconstruct the full renormalisation scale dependence of the coefficients and of the partial sums $\sigma_k = \sum_{i=0}^{k} c_i\alpha_s^i$. Once again, note that only the coefficient $c_i(Q,Q) = c_{i,0}$ needs to be explicitly computed at each order.
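The recurrence can be turned into a short routine that rebuilds the scale dependence of the coefficients. The sketch below assumes that $c_i(Q,\mu)$ is the polynomial $\sum_l c_{i,l}\ln^l(\mu^2/Q^2)$ with $c_{i,0} = c_i(Q,Q)$, and that $c_{i,l}$ for $l \ge 1$ follows from eq. (91); the function names and the input values are hypothetical.

```python
import math

def log_coefficients(c_at_Q, beta):
    """Build c_{i,l} from c_{i,0} = c_i(Q,Q) with the recurrence of eq. (91).

    c_at_Q -- [c_0(Q,Q), c_1(Q,Q), ...]
    beta   -- [beta_0, beta_1, ...], at least len(c_at_Q) - 1 entries
    Returns a list whose i-th entry is [c_{i,0}, ..., c_{i,max(i-1,0)}].
    """
    c = []
    for i, c_i0 in enumerate(c_at_Q):
        row = [c_i0]
        for l in range(1, i):
            # the j = 0 term carries a factor j = 0 and is skipped
            total = sum(j * beta[i - 1 - j] * c[j][l - 1]
                        for j in range(1, i) if l - 1 < len(c[j]))
            row.append(total / l)
        c.append(row)
    return c

def coefficient_at_scale(row, mu, Q):
    """c_i(Q, mu) as a polynomial in ln(mu^2 / Q^2)."""
    L = math.log(mu ** 2 / Q ** 2)
    return sum(c_il * L ** l for l, c_il in enumerate(row))

# Illustrative inputs: beta_j taken from the first column of table 1,
# coefficients chosen arbitrarily.
rows = log_coefficients([1.0, 0.3, 0.1, -0.4], [0.61, 0.24, 0.07])
print([coefficient_at_scale(r, mu=2.0, Q=1.0) for r in rows])
```

With these conventions the first non-trivial entries are $c_{2,1} = \beta_0 c_1$, $c_{3,1} = \beta_1 c_1 + 2\beta_0 c_2$ and $c_{3,2} = \beta_0^2 c_1$.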

Finally, we give the expression of the derivative of a partial sum $\sigma_k$ with respect to $\ln(\mu^2/Q^2)$. From eqs. (87) and (89) we get

$$
\frac{d\sigma_k}{d\ln\mu^2} = -\sum_{i=k+1}^{\infty}\alpha_s^i\sum_{j=0}^{k} j\,\beta_{i-1-j}\,c_j \tag{92}
$$

showing that, as expected, the residual scale dependence of $\sigma_k$ is of higher order $\alpha_s^{k+1}$. If we now consider that $\alpha_s \ll 1$, observe that the known coefficients are such that $\beta_j < \beta_0$ (see table 1), and assume all the $|c_j|$ are of the same order, we can approximate this result as

$$
\frac{d\sigma_k}{d\ln\mu^2} \simeq -k\,\beta_0\,c_k\,\alpha_s^{k+1} \tag{93}
$$

B Derivations of density distributions and uncertainty intervals



B.l Derivation of f(c n \co, . . . , c&) in eq. (24) 



We wish to derive derive eq. (24). We first compute the density f e (c n \co, . . . ,Ck) for e ^ 0. Using 
the definition of conditional density, we have 

/ e (c w |co,...,c fc ) = / f.°'---' Cfc ' C " ) Vn>k (94) 

f e (C0, ...,C k ) 

Both densities / e (co, . . . , Ck) and / £ (cq, • • • , c^, c n ) are obtained from the independence hypothesis 



(21) and the expressions of the densities of the model (20) and (22). We have then 



/e(co, • • • ,C fc ) 



/e(C(), ... ,Cfc,c) dc 

/ e (co, . . . ,Cfe|c)/ e (c) dc 
k 

Ci\c) f e (c) dc 



i=0 



/ n 2c X|ci 



,i=0 
1 1 



1 1 

rTj i T ^Xe<c<l/e 

2 me c ' 



dc 



1 



I me | ./max(|c |,...,|c fe |,e) L 



dc. 



A similar calculation gives for f e (co, . . . ,Ck,c n ): 

ft \ 1 1 

/ e (co, . . . ,c k ,c n ) 



1A 



2 fc +3|lne| 7 max (| C0 |,...,| Cfe |,| Cn |, e) c fe + 3 
We can then obtain the expression for the density f t (c n \co, . . . , c^) 

/e(Cn|co,...,Cfe) = 

and, going to the limit e — >• 0, 



dc. 



_J^fg-(fe+2)llA 

fc+2L c Jmax(|c |,...,|c fe |,|c n |,e) 



'k+1 I 



-(fc+l)! 1 /^ 
J ma: 



max(|c |,...,|c fc |,e) 



/(c n |c , . . . ,Cfc) 



1 fc + 1 max(|co|,...,|c fc |) fc +1 

2 fc + 2 max(|c |, . . . , |c fc |, |c r , 



^+2 ' 



(95) 



(96) 



(97) 



(98) 



Note that in this equation the value $k+1$ represents the total number of known perturbative coefficients $c_0,\ldots,c_k$ used to estimate $c_n$ with $n>k$, rather than simply one unit above the last calculated perturbative order $k$. Similarly, $k+2$ is this total number plus one. If a series starts at a non-zero order $\alpha_s^l$, its last known perturbative order $k$ plus one no longer gives the number of known coefficients. We detail in section 4 the modifications to be made to account for such a case.
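As a quick numerical cross-check of eq. (98), the following minimal sketch evaluates the density and verifies that it integrates to one. It assumes a series starting at order zero, so that the number of known coefficients is $k+1$; the function names and the coefficient values in the usage note are purely illustrative and do not come from the paper.

```python
def density_cn(c_n, known):
    """Eq. (98): density of an unknown coefficient c_n given the known ones.

    `known` is the list [c_0, ..., c_k]; its length plays the role of k + 1.
    """
    n_known = len(known)                    # k + 1 in the text
    cbar_k = max(abs(c) for c in known)     # max(|c_0|, ..., |c_k|)
    largest = max(cbar_k, abs(c_n))         # max(|c_0|, ..., |c_k|, |c_n|)
    return 0.5 * n_known / (n_known + 1) * cbar_k**n_known / largest**(n_known + 1)


def total_probability(known, steps=2000):
    """Midpoint-rule integral of density_cn over the whole real line."""
    cbar_k = max(abs(c) for c in known)
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        # central region |c_n| <= cbar_k, using c_n = u * cbar_k
        total += 2.0 * cbar_k * density_cn(u * cbar_k, known) * h
        # both tails |c_n| > cbar_k, using the substitution c_n = cbar_k / u
        total += 2.0 * density_cn(cbar_k / u, known) * cbar_k / u**2 * h
    return total
```

For instance, with the hypothetical coefficients `[1.0, 1.4, -0.9]` the flat central region carries a probability $(k+1)/(k+2) = 3/4$ and the two power-law tails carry the remaining $1/4$, so `total_probability` should return a value very close to 1.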
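B.2 Derivation of $f(\bar c|c_0,\ldots,c_k)$ in eq. (23)

The result (23) is obtained in a way similar to the one discussed in appendix B.1. We find

$$\begin{aligned}
f_\epsilon(\bar c|c_0,\ldots,c_k) &= \frac{f_\epsilon(\bar c,c_0,\ldots,c_k)}{f_\epsilon(c_0,\ldots,c_k)} \\
&= \frac{f_\epsilon(\bar c,c_0,\ldots,c_k)}{\int f_\epsilon(\bar c,c_0,\ldots,c_k)\,d\bar c} \\
&= \frac{f_\epsilon(c_0,\ldots,c_k|\bar c)\,f_\epsilon(\bar c)}{\int f_\epsilon(c_0,\ldots,c_k|\bar c)\,f_\epsilon(\bar c)\,d\bar c} \\
&= \frac{\dfrac{1}{2^{k+2}|\ln\epsilon|}\,\dfrac{1}{\bar c^{\,k+2}}\,\chi_{\max(|c_0|,\ldots,|c_k|,\epsilon)\le \bar c\le 1/\epsilon}}{\displaystyle\int \dfrac{1}{2^{k+2}|\ln\epsilon|}\,\dfrac{1}{\bar c^{\,k+2}}\,\chi_{\max(|c_0|,\ldots,|c_k|,\epsilon)\le \bar c\le 1/\epsilon}\,d\bar c}.
\end{aligned} \tag{99}$$

Taking the limit $\epsilon \to 0$ and using the notation $\bar c_{(k)} = \max(|c_0|,\ldots,|c_k|)$ we get

$$f(\bar c|c_0,\ldots,c_k) = (k+1)\,\frac{\bar c_{(k)}^{\,k+1}}{\bar c^{\,k+2}}\,\chi_{\bar c\ge \bar c_{(k)}}. \tag{100}$$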
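The posterior density of eq. (100) is a pure power law above $\bar c_{(k)}$, so it can be sampled by inverting its cumulative distribution, $F(\bar c) = 1 - (\bar c_{(k)}/\bar c)^{k+1}$. The sketch below illustrates this; the function names are hypothetical, and the series is assumed to start at order zero so that `len(known)` equals $k+1$.

```python
import random


def cbar_posterior_pdf(cbar, known):
    """Eq. (100): posterior density of cbar given the known coefficients."""
    n_known = len(known)                    # k + 1 in the text
    cbar_k = max(abs(c) for c in known)     # cbar_(k)
    if cbar < cbar_k:
        return 0.0                          # excluded by the indicator function
    return n_known * cbar_k**n_known / cbar**(n_known + 1)


def sample_cbar(known, rng):
    """Draw one value of cbar by inverse-transform sampling of eq. (100)."""
    n_known = len(known)
    cbar_k = max(abs(c) for c in known)
    u = rng.random()                        # uniform in [0, 1)
    # Solve u = 1 - (cbar_k / cbar)**n_known for cbar.
    return cbar_k * (1.0 - u) ** (-1.0 / n_known)
```

A usage example would be `sample_cbar([1.0, 1.4, -0.9], random.Random(1))`; every draw is at least $\bar c_{(k)} = 1.4$, and the median of the draws should approach $\bar c_{(k)}\,2^{1/(k+1)}$.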



B.3 Derivation of the smallest p%-credible interval in eq. (36) 



The density $f(\Delta_k|c_0,\ldots,c_k)$ in eq. (34) is symmetric for negative and positive $\Delta_k$, and decreases monotonically from $\Delta_k = 0$ to infinity (see figure 2). The smallest interval of fixed degree of belief, which we denote by $[-d_k^{(p)}, d_k^{(p)}]$, will then also be symmetric.

Two cases apply. With $p$ sufficiently large, this interval will extend beyond the $[-\alpha_s^{k+1}\bar c_{(k)},\ \alpha_s^{k+1}\bar c_{(k)}]$ range, so that the density's expression in eq. (34) can be simplified as

$$f(\Delta_k|c_0,\ldots,c_k) = \left(\frac{k+1}{k+2}\right)\frac{1}{2\alpha_s^{k+1}\bar c_{(k)}}\,\frac{1}{\left(|\Delta_k|/(\alpha_s^{k+1}\bar c_{(k)})\right)^{k+2}} \qquad \text{for } \Delta_k \notin [-d_k^{(p)}, d_k^{(p)}]. \tag{101}$$



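Noting that the degree of belief outside of $[-d_k^{(p)}, d_k^{(p)}]$ is $1 - p\%$ and that the interval and the density are symmetric, we have

$$\begin{aligned}
\frac{1-p\%}{2} &= \int_{d_k^{(p)}}^{\infty} f(\Delta_k|c_0,\ldots,c_k)\,d\Delta_k \\
&= \int_{d_k^{(p)}}^{\infty} \left(\frac{k+1}{k+2}\right)\frac{1}{2\alpha_s^{k+1}\bar c_{(k)}}\,\frac{1}{\left(|\Delta_k|/(\alpha_s^{k+1}\bar c_{(k)})\right)^{k+2}}\,d\Delta_k \\
&= \frac{k+1}{k+2}\,\frac{(\alpha_s^{k+1}\bar c_{(k)})^{k+1}}{2}\int_{d_k^{(p)}}^{\infty}\frac{d\Delta_k}{\Delta_k^{k+2}} \\
&= \frac{k+1}{k+2}\,\frac{(\alpha_s^{k+1}\bar c_{(k)})^{k+1}}{2}\,\frac{1}{k+1}\,\frac{1}{\left(d_k^{(p)}\right)^{k+1}}.
\end{aligned} \tag{102}$$

From this we obtain

$$d_k^{(p)} = \alpha_s^{k+1}\bar c_{(k)}\left[(k+2)(1-p\%)\right]^{-1/(k+1)}. \tag{103}$$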

If, on the other hand, the interval is smaller than the $[-\alpha_s^{k+1}\bar c_{(k)},\ \alpha_s^{k+1}\bar c_{(k)}]$ range, it is the upper expression in eq. (34) that enters the calculation of the degree of belief. One finds

$$p\% = \int_{-d_k^{(p)}}^{d_k^{(p)}} f(\Delta_k|c_0,\ldots,c_k)\,d\Delta_k = \int_{-d_k^{(p)}}^{d_k^{(p)}} \left(\frac{k+1}{k+2}\right)\frac{1}{2\alpha_s^{k+1}\bar c_{(k)}}\,d\Delta_k = d_k^{(p)}\,\frac{k+1}{k+2}\,\frac{1}{\alpha_s^{k+1}\bar c_{(k)}}, \tag{104}$$

so that

$$d_k^{(p)} = \alpha_s^{k+1}\bar c_{(k)}\,\frac{k+2}{k+1}\,p\%. \tag{105}$$

One can see that the first case, leading to eq. (103), applies for $p\% > \frac{k+1}{k+2}$, whereas the second one, leading to eq. (105), holds for $p\% \le \frac{k+1}{k+2}$.
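The two cases of eqs. (103) and (105) can be collected in one small function. This is only a sketch under stated assumptions: the series is taken to start at order zero, so that `len(known)` is $k+1$, the degree of belief `p` is passed as a fraction between 0 and 1, and the switch between the two branches is placed at $p = (k+1)/(k+2)$, where both expressions give $d_k^{(p)} = \alpha_s^{k+1}\bar c_{(k)}$. The function name is illustrative.

```python
def credible_half_width(p, alpha_s, known):
    """Half-width d_k^(p) of the smallest p-credible interval for Delta_k.

    p       : degree of belief as a fraction, 0 < p < 1
    alpha_s : value of the expansion parameter
    known   : list [c_0, ..., c_k] of known coefficients
    """
    k = len(known) - 1
    cbar_k = max(abs(c) for c in known)     # cbar_(k)
    scale = alpha_s ** (k + 1) * cbar_k     # edge of the flat region of the density
    threshold = (k + 1) / (k + 2)           # probability carried by the flat region
    if p <= threshold:
        # eq. (105): the interval lies inside the flat region
        return scale * (k + 2) / (k + 1) * p
    # eq. (103): the interval reaches into the power-law tails
    return scale * ((k + 2) * (1.0 - p)) ** (-1.0 / (k + 1))
```

With three known coefficients ($k = 2$) the threshold is $0.75$, so a 68% interval is obtained from the linear branch while a 95% interval comes from the power-law branch.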



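B.4 Derivation of the approximate $f(\Delta_k|c_0,\ldots,c_k)$ in eq. (37)

From eq. (32) and making the approximation $\Delta_k \simeq \alpha_s^{k+1} c_{k+1}$ we get

$$\begin{aligned}
f(\Delta_k|c_0,\ldots,c_k) &= \int \left[\delta\!\left(\Delta_k - \sum_{n=k+1}^{\infty} c_n\alpha_s^n\right)\right]\left[\int \left(\prod_{n=k+1}^{\infty} f(c_n|\bar c)\right) f(\bar c|c_0,\ldots,c_k)\,d\bar c\right] dc_{k+1}\,dc_{k+2}\ldots \\
&\simeq \int \delta\!\left(\Delta_k - c_{k+1}\alpha_s^{k+1}\right)\left[\int \left(\prod_{n=k+1}^{\infty} f(c_n|\bar c)\right) f(\bar c|c_0,\ldots,c_k)\,d\bar c\right] dc_{k+1}\,dc_{k+2}\ldots \\
&= \int \delta\!\left(\Delta_k - c_{k+1}\alpha_s^{k+1}\right)\left[\int f(c_{k+1}|\bar c)\,f(\bar c|c_0,\ldots,c_k)\,d\bar c\right] dc_{k+1} \\
&= \frac{1}{\alpha_s^{k+1}}\int f\!\left(c_{k+1} = \frac{\Delta_k}{\alpha_s^{k+1}}\,\Big|\,\bar c\right) f(\bar c|c_0,\ldots,c_k)\,d\bar c.
\end{aligned} \tag{106}$$

We can reinstate explicitly the $\epsilon$-dependence in the equation above, and rewrite the density $f(\bar c|c_0,\ldots,c_k)$ in terms of the elementary densities of the model, so that the resulting expression can be used with any density distributions. We obtain

$$\begin{aligned}
f_\epsilon(\Delta_k|c_0,\ldots,c_k) &= \frac{1}{\alpha_s^{k+1}}\int f\!\left(c_{k+1} = \frac{\Delta_k}{\alpha_s^{k+1}}\,\Big|\,\bar c\right)\frac{f_\epsilon(c_0,\ldots,c_k|\bar c)\,f_\epsilon(\bar c)}{f_\epsilon(c_0,\ldots,c_k)}\,d\bar c \\
&= \frac{1}{\alpha_s^{k+1}}\,\frac{1}{f_\epsilon(c_0,\ldots,c_k)}\int f\!\left(c_{k+1} = \frac{\Delta_k}{\alpha_s^{k+1}}\,\Big|\,\bar c\right) f(c_0|\bar c)\cdots f(c_k|\bar c)\,f_\epsilon(\bar c)\,d\bar c.
\end{aligned} \tag{107}$$

Under this form, the evaluation of $f_\epsilon(\Delta_k|c_0,\ldots,c_k)$ can be performed numerically with $\epsilon \neq 0$.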
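To illustrate how such a numerical evaluation might look, the sketch below integrates the last line of eq. (106) in the $\epsilon \to 0$ limit, using the uniform density $f(c_{k+1}|\bar c) = 1/(2\bar c)$ for $|c_{k+1}| \le \bar c$ and the posterior of eq. (100), and compares it with the closed form quoted from eq. (34) in appendix B.3. The function names, the midpoint rule and the substitution $\bar c = \bar c_{\min}/u$ are choices made here for illustration and are not taken from the paper; the series is assumed to start at order zero.

```python
def delta_density_numeric(delta, alpha_s, known, steps=4000):
    """Midpoint-rule evaluation of the last line of eq. (106)."""
    n_known = len(known)                        # k + 1
    cbar_k = max(abs(c) for c in known)         # cbar_(k)
    x = abs(delta) / alpha_s ** n_known         # implied value of |c_{k+1}|
    lower = max(x, cbar_k)                      # both indicator functions require cbar >= lower
    h = 1.0 / steps
    total = 0.0
    for i in range(steps):
        u = (i + 0.5) * h
        cbar = lower / u                        # maps u in (0, 1) to cbar in (lower, infinity)
        uniform = 1.0 / (2.0 * cbar)            # f(c_{k+1} | cbar)
        posterior = n_known * cbar_k ** n_known / cbar ** (n_known + 1)  # eq. (100)
        total += uniform * posterior * lower / u ** 2 * h                # Jacobian lower / u^2
    return total / alpha_s ** n_known


def delta_density_closed(delta, alpha_s, known):
    """Closed form of the approximate density, as quoted from eq. (34)."""
    n_known = len(known)
    cbar_k = max(abs(c) for c in known)
    scale = alpha_s ** n_known * cbar_k
    plateau = n_known / (n_known + 1) / (2.0 * scale)
    ratio = abs(delta) / scale
    return plateau if ratio <= 1.0 else plateau / ratio ** (n_known + 1)
```

The two functions should agree both in the flat region and in the tails, for example for `known = [1.0, 1.4, -0.9]` and `alpha_s = 0.12` (hypothetical values).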



B.5 Derivation of the degree of belief of the scale variation bands in eq. (43) 



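The result in eq. (43) for the degree of belief of an interval $\left[-\frac{\delta_k}{2}, \frac{\delta_k}{2}\right]$ can be easily obtained by recalling the derivation in appendix B.3 of eq. (36) and inverting the final result, i.e. expressing the degree of belief as a function of the interval width rather than vice versa. One easily obtains

$$C\!\left(\Delta_k \in \left[-\tfrac{\delta_k}{2}, \tfrac{\delta_k}{2}\right]\Big|\,c_0,\ldots,c_k\right) = \begin{cases} 1 - \dfrac{1}{k+2}\left(\dfrac{\alpha_s^{k+1}\bar c_{(k)}}{\delta_k/2}\right)^{k+1} & \text{if } \dfrac{\delta_k}{2} > \alpha_s^{k+1}\bar c_{(k)}, \\[2ex] \dfrac{k+1}{k+2}\,\dfrac{\delta_k/2}{\alpha_s^{k+1}\bar c_{(k)}} & \text{if } \dfrac{\delta_k}{2} \le \alpha_s^{k+1}\bar c_{(k)}, \end{cases} \tag{108}$$

and, using the result in eq. (13),

$$\delta_k \simeq 3k\beta_0\,\alpha_s^{k+1}|c_k|, \tag{109}$$

we get

$$C\!\left(\Delta_k \in \left[-\tfrac{\delta_k}{2}, \tfrac{\delta_k}{2}\right]\Big|\,c_0,\ldots,c_k\right) = \begin{cases} 1 - \dfrac{1}{k+2}\left(\dfrac{2\,\bar c_{(k)}}{3k\beta_0|c_k|}\right)^{k+1} & \text{if } \dfrac{\delta_k}{2} > \alpha_s^{k+1}\bar c_{(k)} \;\Leftrightarrow\; |c_k| > \dfrac{2}{3k\beta_0}\,\bar c_{(k)}, \\[2ex] \dfrac{k+1}{k+2}\,\dfrac{3k\beta_0|c_k|}{2\,\bar c_{(k)}} & \text{if } \dfrac{\delta_k}{2} \le \alpha_s^{k+1}\bar c_{(k)} \;\Leftrightarrow\; |c_k| \le \dfrac{2}{3k\beta_0}\,\bar c_{(k)}. \end{cases} \tag{110}$$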
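The inversion described above can be written as a short function that maps a half-width $\delta_k/2$ to its degree of belief. This is a minimal sketch, assuming a series starting at order zero and taking the half-width as an input rather than computing it from a scale variation; the function name is illustrative.

```python
def degree_of_belief(half_width, alpha_s, known):
    """Degree of belief that Delta_k lies in [-half_width, +half_width].

    half_width : delta_k / 2, the half-width of the band (non-negative)
    alpha_s    : value of the expansion parameter
    known      : list [c_0, ..., c_k] of known coefficients
    """
    k = len(known) - 1
    cbar_k = max(abs(c) for c in known)     # cbar_(k)
    scale = alpha_s ** (k + 1) * cbar_k     # edge of the flat region of the density
    if half_width <= scale:
        # band contained in the flat region: belief grows linearly with the width
        return (k + 1) / (k + 2) * half_width / scale
    # band reaching into the tails: belief approaches 1 as a power law
    return 1.0 - (scale / half_width) ** (k + 1) / (k + 2)
```

At `half_width == scale` both branches give $(k+1)/(k+2)$, so the result is continuous in the width of the band.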
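B.6 Derivation of $f(c_n|c_0,\ldots,c_k,\tilde c_{k+1})$ in eq. (77)

Let us first derive the expression of $f(c_{k+1}|c_0,\ldots,c_k,\tilde c_{k+1})$.

$$\begin{aligned}
f_\epsilon(c_{k+1}|c_0,\ldots,c_k,\tilde c_{k+1}) &= \frac{f_\epsilon(c_0,\ldots,c_k,c_{k+1},\tilde c_{k+1})}{\int f_\epsilon(c_0,\ldots,c_k,c_{k+1},\tilde c_{k+1})\,dc_{k+1}} \\
&= \frac{f_\epsilon(c_0,\ldots,c_k|c_{k+1},\tilde c_{k+1})\,f_\epsilon(c_{k+1},\tilde c_{k+1})}{\int f_\epsilon(c_0,\ldots,c_k|c_{k+1},\tilde c_{k+1})\,f_\epsilon(c_{k+1},\tilde c_{k+1})\,dc_{k+1}}.
\end{aligned} \tag{111}$$

Using the fact that when a coefficient is fully known, knowing it approximately adds nothing (see eqs. (72)), we can rewrite this as

$$\begin{aligned}
f_\epsilon(c_{k+1}|c_0,\ldots,c_k,\tilde c_{k+1}) &= \frac{f_\epsilon(c_0,\ldots,c_k|c_{k+1})\,f_\epsilon(c_{k+1},\tilde c_{k+1})}{\int f_\epsilon(c_0,\ldots,c_k|c_{k+1})\,f_\epsilon(c_{k+1},\tilde c_{k+1})\,dc_{k+1}} \\
&= \frac{\left[f_\epsilon(c_0,\ldots,c_k,c_{k+1})/f_\epsilon(c_{k+1})\right]\left[f_\epsilon(\tilde c_{k+1}|c_{k+1})\,f_\epsilon(c_{k+1})\right]}{\int \left[f_\epsilon(c_0,\ldots,c_k,c_{k+1})/f_\epsilon(c_{k+1})\right]\left[f_\epsilon(\tilde c_{k+1}|c_{k+1})\,f_\epsilon(c_{k+1})\right] dc_{k+1}} \\
&= \frac{f_\epsilon(c_0,\ldots,c_k,c_{k+1})\,f_\epsilon(\tilde c_{k+1}|c_{k+1})}{\int f_\epsilon(c_0,\ldots,c_k,c_{k+1})\,f_\epsilon(\tilde c_{k+1}|c_{k+1})\,dc_{k+1}} \\
&= \mathcal{N}_\epsilon\,f_\epsilon(c_{k+1}|c_0,\ldots,c_k)\,f(\tilde c_{k+1}|c_{k+1}),
\end{aligned} \tag{112}$$

where with $\mathcal{N}_\epsilon$ we denote the normalisation factor. In the limit $\epsilon \to 0$ we can therefore write

$$f(c_{k+1}|c_0,\ldots,c_k,\tilde c_{k+1}) = \mathcal{N}\,f(c_{k+1}|c_0,\ldots,c_k)\,f(\tilde c_{k+1}|c_{k+1}). \tag{113}$$

The density $f(c_n|c_0,\ldots,c_k,\tilde c_{k+1})$, for $n > k+1$, is then simply

$$\begin{aligned}
f(c_n|c_0,\ldots,c_k,\tilde c_{k+1}) &= \int f(c_n,c_{k+1}|c_0,\ldots,c_k,\tilde c_{k+1})\,dc_{k+1} \\
&= \int f(c_n|c_0,\ldots,c_k,c_{k+1},\tilde c_{k+1})\,f(c_{k+1}|c_0,\ldots,c_k,\tilde c_{k+1})\,dc_{k+1} \\
&= \mathcal{N}\int f(c_n|c_0,\ldots,c_k,c_{k+1})\,f(c_{k+1}|c_0,\ldots,c_k)\,f(\tilde c_{k+1}|c_{k+1})\,dc_{k+1} \\
&= \mathcal{N}\int f(c_n,c_{k+1}|c_0,\ldots,c_k)\,f(\tilde c_{k+1}|c_{k+1})\,dc_{k+1}.
\end{aligned} \tag{114}$$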